Mixed-Session Conversation with Egocentric Memory

Jihyoung Jang, Taeyoung Kim, Hyounghun Kim

UNIST AI
Findings of EMNLP 2024

Abstract

Recently introduced dialogue systems have demonstrated high usability. However, they still fall short of reflecting real-world conversation scenarios: current systems cannot replicate dynamic, continuous, long-term interactions involving multiple partners. This shortfall arises because there have been limited efforts to account for both aspects of real-world dialogues: deeply layered interactions over long-term dialogue and widely expanded conversation networks involving multiple participants. To incorporate both aspects, we introduce Mixed-Session Conversation, a dialogue system designed to construct conversations with various partners in a multi-session dialogue setup. We propose a new dataset called MiSC to implement this system. Each dialogue episode in MiSC consists of 6 consecutive sessions, with four speakers (one main speaker and three partners) appearing in each episode. We also propose a new dialogue model with a novel memory management mechanism, called Egocentric Memory Enhanced Mixed-Session Conversation Agent (EMMA). EMMA collects and retains memories from the main speaker's perspective during conversations with partners, enabling seamless continuity in subsequent interactions. Extensive human evaluations validate that the dialogues in MiSC demonstrate a seamless conversational flow, even when conversation partners change in each session. Evaluations also show that EMMA trained with MiSC maintains high memorability without contradiction throughout the entire conversation.

A sample episode from MiSC. At the end of each session, the main speaker collects memories about each speaker from its own perspective and uses them to continue the conversation in the following session. Symbols mark the memory referenced when generating an utterance, and connected memories share the same symbol.

Motivation

Open-domain conversations have generally evolved in two directions:

Width:	Expanding the number of participants involved (e.g., multi-party conversations)
Depth:	Sustaining conversation continuity over time (e.g., multi-session conversations)

Despite these advancements, little research addresses both width and depth together. Mixed-Session Conversation, a novel approach that combines both aspects, fills this gap. In a mixed-session conversation, the main speaker engages with different partners across multiple sessions. This contrasts with traditional multi-session conversations, where the speaker interacts with the same set of partners throughout. By introducing new conversation partners in each session, mixed-session conversations bring a dynamic, evolving interaction that enriches both the width and depth of the conversation.

MiSC

We propose MiSC, the first dataset to implement mixed-session conversations. Each episode in MiSC features one main speaker interacting with three different partners across six consecutive dialogue sessions. The dataset allows for dynamic interactions between the main speaker and different conversational partners, which distinguishes MiSC from traditional multi-session or multi-party datasets, where speakers engage with the same fixed partners throughout.
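The episode layout described above can be pictured as a small data structure. The following is only an illustrative sketch, not the released MiSC format: the class names, the fields, and the assumption that each session has exactly one partner are hypothetical, and the check encodes only the stated counts (six sessions, three distinct partners).

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    partner: str  # assumed: one partner per session
    utterances: list[tuple[str, str]] = field(default_factory=list)  # (speaker, text)


@dataclass
class Episode:
    main_speaker: str
    sessions: list[Session]

    def partners(self) -> set[str]:
        """Distinct partners the main speaker meets in this episode."""
        return {s.partner for s in self.sessions}

    def is_valid(self) -> bool:
        # Stated counts only: 6 sessions, 3 distinct partners, and the
        # main speaker never appears as their own partner.
        return (
            len(self.sessions) == 6
            and len(self.partners()) == 3
            and self.main_speaker not in self.partners()
        )


# Hypothetical episode: the partner order and the name "Carol" are invented.
order = ["Bob", "Henry", "Carol", "Bob", "Henry", "Carol"]
episode = Episode("Alice", [Session(p) for p in order])
print(episode.is_valid())
```

How the six sessions are distributed among the three partners is not specified here, so the sketch deliberately leaves the ordering unconstrained.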

Egocentric Memory

In a mixed-session conversation, the main speaker must retain context across multiple sessions with different partners. To achieve seamless interactions in each session, the main speaker requires a separate memory mechanism capable of handling these complexities, something that has not existed before.

To address this, we propose a novel memory management mechanism called Egocentric Memory, which summarizes the memory elements of both the main speaker and their conversation partners from the main speaker's perspective during each session. However, simply summarizing these memories becomes insufficient as the sessions progress. We therefore introduce 'memory links', which keep updates to the memory content interconnected across related memories and maintain coherence throughout the conversation.
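One way to picture a memory store with links is sketched below. This is a minimal illustration under stated assumptions, not the authors' implementation: the class and method names are hypothetical, links are assumed to be symmetric, and `update` simply reports which linked entries may need to be revisited. The two memory texts are taken from the first example on this page.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryEntry:
    text: str   # summary written from the main speaker's point of view
    about: str  # the person the memory concerns
    links: set[int] = field(default_factory=set)  # ids of related entries


class EgocentricMemory:
    """Hypothetical store; entries are addressed by their list index."""

    def __init__(self) -> None:
        self.entries: list[MemoryEntry] = []

    def add(self, text, about, related=()):
        new_id = len(self.entries)
        self.entries.append(MemoryEntry(text, about))
        for other in related:
            # Assumption: a link is symmetric between the two entries.
            self.entries[new_id].links.add(other)
            self.entries[other].links.add(new_id)
        return new_id

    def linked(self, idx):
        return [self.entries[i] for i in sorted(self.entries[idx].links)]

    def update(self, idx, new_text):
        """Replace an entry's text and return ids of linked entries to re-check."""
        self.entries[idx].text = new_text
        return sorted(self.entries[idx].links)


memory = EgocentricMemory()
bob_id = memory.add(
    "Bob is having a hard time academically and worrying about his grades "
    "being bad for college.",
    about="Bob",
)
alice_id = memory.add(
    "I am willing to help Bob with his grades, and he asked me for "
    "counseling to his parents.",
    about="Alice",
    related=[bob_id],
)
print([m.about for m in memory.linked(alice_id)])
```

Keeping links explicit means that a later change to one memory can surface the related memories that should stay consistent with it.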

EMMA

We introduce EMMA, a new dialogue model trained with MiSC and built on egocentric memory, allowing it to expand the conversation network in both width and depth from all perspectives.

EMMA collects memories of each conversation partner from its own perspective in every session, ensuring seamless continuity in subsequent sessions. It is composed of two modules: (1) a dialogue module and (2) a retrieval module, utilizing a total of 1B parameters (780M for the dialogue module and 220M for the retrieval module).
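The split between a retrieval module and a dialogue module can be sketched as a retrieve-then-generate pipeline. Everything below is an assumption made for illustration: word-overlap scoring stands in for the learned retrieval module, the `[memory about ...]` input format is invented, and the dialogue module itself is omitted (only its input is built). The memory texts come from the first example on this page.

```python
import re


def tokens(text):
    """Lowercased word set used by the toy scorer."""
    return set(re.findall(r"[a-z']+", text.lower()))


def retrieve(query, memories, k=1):
    """Return indices of the k memories with the highest Jaccard overlap.

    A toy stand-in for a learned retriever, not EMMA's retrieval module.
    """
    q = tokens(query)

    def score(i):
        t = tokens(memories[i]["text"])
        return len(q & t) / len(q | t) if (q | t) else 0.0

    return sorted(range(len(memories)), key=score, reverse=True)[:k]


def expand_with_links(ids, memories):
    """Append memories linked to the retrieved ones, keeping order, no repeats."""
    out = list(ids)
    for i in ids:
        for j in memories[i]["links"]:
            if j not in out:
                out.append(j)
    return out


def build_dialogue_input(history, memory_ids, memories):
    """Assemble text input for a dialogue model (hypothetical format)."""
    lines = [
        f"[memory about {memories[i]['about']}] {memories[i]['text']}"
        for i in memory_ids
    ]
    lines += [f"{speaker}: {text}" for speaker, text in history]
    return "\n".join(lines)


memories = [
    {"about": "Alice", "links": [1],
     "text": "I am willing to help Bob with his grades, and he asked me "
             "for counseling to his parents."},
    {"about": "Bob", "links": [0],
     "text": "Bob is having a hard time academically and worrying about "
             "his grades being bad for college."},
]
history = [("Henry", "Could I discuss my child with you?")]

top = retrieve(history[-1][1], memories, k=1)
context_ids = expand_with_links(top, memories)
print(build_dialogue_input(history, context_ids, memories))
```

Expanding the retrieved memory with its linked entries is what lets the reply to Henry draw on what Bob said in an earlier session.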

Examples

• A human live chat example where EMMA uses Egocentric Memory

Speakers: Alice (Bob's teacher, Main Speaker) and Bob (Student)

⋮

Bob: I'm worried that my grades aren't good enough for me to go to college. My future seems really bleak.

Alice: I understand, Bob. It can be tough to see your dreams coming true, especially when you're struggling with grades.

Bob: Could you possibly provide counseling to my parents regarding this matter? They're more concerned than I am.

⋮

Speakers: Alice (Bob's teacher, Main Speaker) and Henry (Bob's father)

⋮

Henry: Could I discuss my child with you?

Alice: Of course, I'd love to. Bob has been struggling with his grades lately and I think you'd be really helpful in seeking help.

Retrieved Memory for Alice's last response: I am willing to help Bob with his grades, and he asked me for counseling to his parents. (about Alice)

Linked Memory with retrieved memory: Bob is having a hard time academically and worrying about his grades being bad for college. (about Bob)

• A human live chat example showing the differences between EMMA and the multi-session conversation models MSC 2.7B and ReBot when the conversation partner changes across sessions

MSC 2.7B (Xu et al., 2022)

My head hurts so much and I keep vomiting.

Oh no! I hope you feel better soon. What did you eat? I had a bad case of food poisoning last week.

⋮

<Start a new session>

Honey, were you busy today?

I was. I was working at the hospital. I am sorry that you got sick. I would have been happy to take you out to eat.

ReBot (Jang et al., 2023)

My head hurts so much and I keep vomiting.

I'm sorry to hear that. How long have you been experiencing these symptoms?

⋮

<Start a new session>

Honey, were you busy today?

I'm sorry, but I have to be here for you. Can you tell me more about your symptoms?

EMMA (Ours)

My head hurts so much and I keep vomiting.

I have noticed that. I'm confirming you've got a headache. Can you tell me more about what's happening?

⋮

<Start a new session>

Honey, were you busy today?

Yes, I had a lot of patients to see. How are you feeling today?

• An ablation study example between EMMA and a summary-based model for the same human live chat context

Current session speakers: Sophia (Leo's teacher, Main Speaker) and Ava (Leo's mom)

⋮

The reason I invited you to school this time is because Leo's math and science grades have improved significantly.

I'm glad to hear that. Leo is doing well and we've been talking about ways to make learning easier for him.

Oh really? You are giving him a lot of encouragement at home. Specifically, Leo's math score increased by 30 points, and science score increased by 50 points.

That's great to hear. Do you think he's working on new concepts or just being less demanding in his classes?

Current session summary (summary-based model): Leo's math and science grades have improved significantly. Ava is giving him encouragement at home and trying to keep him motivated and receptive to learning new things.

Egocentric Memory of current session (EMMA):

- Leo's math and science grades have improved significantly, with his math score increasing by 30 points and his science score increasing by 50 points. (about Sophia)

- I have been helping Leo practice math, science, and English at home. (about Sophia)

- Sophia was impressed with my contribution. (about Ava)

......

BibTeX

@article{jang2024mixed,
  title={Mixed-Session Conversation with Egocentric Memory},
  author={Jang, Jihyoung and Kim, Taeyoung and Kim, Hyounghun},
  journal={arXiv preprint arXiv:2410.02503},
  year={2024}
}
