Where to find the benchmark datasets (MemPrivacy-Bench / PersonaMem-v2 .jsonl)? #3

Hi MemPrivacy team, thanks for the great work!

I'd like to run the evaluation in this repo, but I can't find the benchmark data. A few questions:

## 1. MemPrivacy-Bench / PersonaMem-v2 data
`evaluation/eval.py` reads from a local file:

```python
input_file = 'test_mem_privacy_annotated_final.jsonl'
```

But this `.jsonl` file does not seem to be committed to the repo, and I can't locate it. The appendix (`supplemental_material/MemPrivacy_Appendices.pdf`) describes:
- **MemPrivacy-Bench**: 200 synthetic users from PersonaHub seeds, bilingual (zh/en), with a train split of 160 users plus a test split
- **PersonaMem-v2 evaluation split**: 20 users, 2,521 turns, 2,378 privacy instances, 563 QA pairs

**Could you please release these datasets (the annotated `.jsonl` files), or point to where they are hosted?**

## 2. Hosting location
The Hugging Face collection (`IAAR-Shanghai/memprivacy`) currently lists only the 4 models (1.7B/4B SFT/RL) and the paper, with no datasets. Is the benchmark planned for release on Hugging Face Datasets, ModelScope, or elsewhere?

## 3. Expected schema
Could you confirm the expected fields of each JSONL record? From the code it looks like: `dialogues`, `uuid`, `metadata`, `questions`. A small sample record would be very helpful for reproducing the eval.
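
In case it helps, this is the kind of sanity check I plan to run once the file is available. It is only a sketch: the four field names come from my reading of `evaluation/eval.py`, the helper name `check_jsonl_schema` is my own, and I am not assuming anything about the types or nesting of the values.

```python
import json
from pathlib import Path

# Top-level keys I believe eval.py expects (an assumption, not confirmed by the authors).
EXPECTED_FIELDS = {"dialogues", "uuid", "metadata", "questions"}


def check_jsonl_schema(path, expected=EXPECTED_FIELDS):
    """Return a list of (line_number, missing_fields) for records lacking expected keys."""
    problems = []
    with Path(path).open(encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            record = json.loads(line)
            if not isinstance(record, dict):
                # A non-object record has none of the expected keys.
                problems.append((line_no, sorted(expected)))
                continue
            missing = set(expected) - set(record)
            if missing:
                problems.append((line_no, sorted(missing)))
    return problems
```

Calling `check_jsonl_schema('test_mem_privacy_annotated_final.jsonl')` would return an empty list if every record carries all four keys, which is why a small sample record would let me verify my understanding before running the full eval.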

Thanks a lot!

