Building a Multi-Modal Flutter Chatbot with LangChain.dart, GPT-4o, and Dash Chat 2

A step-by-step guide to creating an AI-powered Flutter chat application that handles both text and images

Introduction to Multi-Modal Chat Applications

While this might not be the first guide on multi-modal chatbots in Flutter, it is likely the easiest to follow. These chat applications enhance our interaction with AI, making it worth learning how to build one. In this definitely-not-short guide, I’ll walk you through building a Flutter chat app with the personality of Dash. We’ll be using LangChain.dart, OpenAI’s powerful GPT-4o model, and Dash Chat 2 to create a seamless and interactive user experience.

What You'll Build: A Sneak Peek into Your AI-Powered Chat App

[GIF: the Chat with Dash app running on macOS]

Or this:

[GIF: Chat with Dash running on my slow phone]

What is a Multi-Modal Chatbot?

A multi-modal chatbot is a chatbot that can handle different kinds of input like text, images, audio, and video. This allows it to understand and interact with users more flexibly and comprehensively. The current versions of popular chatbots like Gemini and ChatGPT already do this, and we’re going to build something similar, though a bit more basic.

Our Tech Stack Explained

  • Flutter & Dash Chat 2: Provides a polished, ready-to-use chat UI with features like message bubbles, image attachments, and typing indicators.

  • LangChain.dart: A framework for developing applications powered by language models.

💡
LangChain.dart is an unofficial Dart port of the popular LangChain Python framework. It provides a set of ready-to-use components for working with language models and a standard interface for chaining them together to create more advanced use cases such as chatbots, question-answering systems, agents, summarisation, translation, and more.
  • GPT-4o: OpenAI's latest ready-to-use model that can understand both text and images, perfect for creating intelligent multi-modal experiences.
💡
GPT-4o accepts as input any combination of text, audio, image, and video and generates any combination of text, audio, and image outputs. It can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human response time in a conversation.

Setting Up Your Development Environment

Flutter Project Configuration

Like the title says, this is going to be a Flutter app, so you need to have the latest version of Flutter installed. Then:

  1. Open your favourite text editor or IDE.

  2. Navigate to your projects directory.

  3. Run flutter create multi_modal or use your IDE to create a Flutter project if you have the necessary extension.

  4. Change the directory to multi_modal via cd multi_modal if you haven’t already.

Required Dependencies

Add the following dependencies to your pubspec.yaml file at the root of the project.

dependencies:
  dash_chat_2: {version}
  image_picker: {version}
  langchain: {version}
  langchain_openai: {version}

You can also do this using the CLI, just copy and paste:

dart pub add langchain langchain_openai dash_chat_2 image_picker

Configuring image_picker

Image picker requires no configuration on Android unless you want to support older Android versions. Some configuration is required for iOS, macOS, Windows, and Linux though. More info here: https://pub.dev/packages/image_picker. I’m going to be running mine on Android, iOS, and macOS.

OpenAI API Key Setup

Since our chatbot is going to be using GPT-4o, we’re going to need an API key from OpenAI or Azure OpenAI. You can get your OpenAI API key here: https://platform.openai.com/api-keys. For Azure OpenAI, you need an Azure account (https://portal.azure.com/), where you create an Azure OpenAI resource and go from there. I’m not going to delve into that now.

If you have a proxy to OpenAI or Azure OpenAI set up, LangChain.dart supports a custom base URL. You can use that instead of passing the API key directly in the app, which is much safer than relying on environment variables.
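
For illustration, pointing the client at such a proxy could look like the sketch below. The baseUrl here is a placeholder, and your proxy would be responsible for attaching the real key on its side:

// A minimal sketch: no API key ships with the app; the proxy adds it server-side.
final chatModel = ChatOpenAI(
  baseUrl: "https://your-proxy.example.com/v1", // hypothetical proxy endpoint
  defaultOptions: ChatOpenAIOptions(model: "gpt-4o"),
);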

Once you have the API key, export it so that it’s available for the terminal session you’ll use to run the app.

macOS/Linux:

export OPENAI_API_KEY='your_actual_api_key_here'

Windows (PowerShell):

$env:OPENAI_API_KEY='your_actual_api_key_here'

Verifying the Setup

Verify your environment variable is set correctly:

macOS/Linux:

echo $OPENAI_API_KEY

Windows PowerShell:

$env:OPENAI_API_KEY
💡
Never hardcode the API key in the app; use environment variables, proxies, or secret managers.
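
If reading the process environment doesn’t suit your target platform, one alternative (not used in this project) is passing the key at build time with --dart-define and reading it with String.fromEnvironment. A minimal sketch:

flutter run --dart-define=OPENAI_API_KEY=your_actual_api_key_here

// In Dart, the value baked in at compile time can then be read like this;
// it falls back to an empty string if the define wasn't provided.
const apiKey = String.fromEnvironment("OPENAI_API_KEY");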

Project Structure

Since I'm from the future, I already know how to set everything up. We’re keeping things simple with just the presentation and repository layer this time, so we can move quickly. You can go ahead and create these files now if you want.

lib/
├── main.dart
├── chat_page.dart
├── chat_repository.dart
├── constants.dart
├── utils/
│   └── either.dart
└── extensions/
    ├── build_context_extension.dart
    └── chat_message_extension.dart

Our project consists of several key files:

  1. main.dart: Application entry point and theme configuration

  2. chat_page.dart: Main chat interface

  3. chat_repository.dart: Handles AI integration

  4. constants.dart: Configuration and user information

Building the Chat Interface

Let's start with our main app configuration:

import "package:flutter/material.dart";
import "package:multi_modal/chat_page.dart";
import "package:multi_modal/chat_repository.dart";

void main() {
  runApp(const MyApp());
}

class MyApp extends StatelessWidget {
  const MyApp({super.key});

  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      title: "Chat with Dash",
      theme: ThemeData(
        colorScheme: ColorScheme.fromSeed(
          primary: Color(0XFF02569B),
          secondary: Color(0XFF13B9FD),
          seedColor: Color(0XFF02569B),
          surface: Color(0XFFF5F5F5),
          tertiary: Color(0XFFFFB300),
          error: Color(0XFFD32F2F),
        ),
        useMaterial3: true,
      ),
      home: const ChatPage(chatRepository: ChatRepository()),
    );
  }
}

I tried to provide a theme that complements Dash, feedback on this is welcome.

Our only route will be the ChatPage, and we’re injecting the ChatRepository into it like you would in an important app.

Creating a Basic Chat UI

The ChatPage widget manages:

  • Message display

  • Image picking

  • Input handling

  • Loading states

Here's the basic structure:

DashChat(
  typingUsers: typingUsers,
  inputOptions: InputOptions(
    inputDisabled: typingUsers.isNotEmpty,
    sendOnEnter: true,
    trailing: [
      if (isMobilePlatform)
        IconButton(
          icon: const Icon(Icons.camera_alt),
          onPressed: typingUsers.isEmpty
              ? () => _pickAndShowImageDialog(source: ImageSource.camera)
              : null,
        ),
      IconButton(
        icon: const Icon(Icons.image),
        onPressed: typingUsers.isEmpty
            ? () => _pickAndShowImageDialog()
            : null,
      ),
    ],
  ),
  currentUser: Constants.user,
  onSend: _handleOnSendPressed,
  messages: messages,
  messageOptions: MessageOptions(showOtherUsersAvatar: true),
)

Understanding DashChat Widget

  • currentUser - required: Basically "us". DashChat needs to know who the current user is so it can put their messages on the right side.

  • Function(ChatMessage message) onSend - required: Function to call when we send a message; this is where we handle the logic of sending the message to our repository and appending it to the list of messages.

  • List<ChatMessage> messages - required: The list of messages in the chat so far.

  • InputOptions inputOptions - optional: Options to customise the behaviour and design of the chat input

    • inputDisabled - optional: You wouldn’t like it if messages showed up out of order, so this ensures we can’t spam the model. You will get rate-limited if you do that.

    • sendOnEnter - optional: This is for the desktop platforms, where we’re used to sending a message when we press the Enter key.

    • trailing - optional: This puts the camera and gallery icons on the right side of the input. As you can see we’re not allowing camera use on desktop. This is because the image_picker package has limited support on desktop.

  • MessageOptions messageOptions - optional: Options to customise the behaviour and design of the messages. We’re only enabling showOtherUsersAvatar because on mobile there isn’t enough space to show both avatars, ours and Dash’s.

  • List<ChatUser> typingUsers - optional: List of users currently typing in the chat. The AI doesn’t care if we’re typing, but it’s nice to see that something is happening on the other side.

If the AI is not typing, all inputs are allowed, including the text input, as you’ll soon see.

The ChatPage

This widget is going to be a stateful one so that we can mutate its state during its lifecycle. We won't use external state-management solutions like Bloc, Provider, or Riverpod this time.

setState should be good enough for now, although I don’t recommend it for non-trivial apps.

import "package:dash_chat_2/dash_chat_2.dart";
import "package:flutter/foundation.dart";
import "package:flutter/material.dart";
import "package:image_picker/image_picker.dart";

import "package:multi_modal/chat_repository.dart";
import "package:multi_modal/constants.dart";
import "package:multi_modal/extensions/build_context_extension.dart";
import "package:multi_modal/extensions/chat_message_extension.dart";

class ChatPage extends StatefulWidget {
  const ChatPage({
    super.key,
    required this.chatRepository,
  });

  final ChatRepository chatRepository;

  @override
  State<ChatPage> createState() => _ChatPageState();
}

class _ChatPageState extends State<ChatPage> {
  final ImagePicker _picker = ImagePicker();
  ChatRepository get _chatRepository => widget.chatRepository;

  List<ChatMessage> messages = [];
  List<ChatUser> typingUsers = [];

  @override
  Widget build(BuildContext context) {
    final isMobilePlatform = defaultTargetPlatform == TargetPlatform.iOS ||
        defaultTargetPlatform == TargetPlatform.android;

    return Scaffold(
      appBar: AppBar(
        backgroundColor: Theme.of(context).colorScheme.inversePrimary,
        title: const Text("Chat with Dash"),
      ),
      body: DashChat(
        typingUsers: typingUsers,
        inputOptions: InputOptions(
          inputDisabled: typingUsers.isNotEmpty,
          sendOnEnter: true,
          trailing: [
            if (isMobilePlatform)
              IconButton(
                icon: const Icon(Icons.camera_alt),
                onPressed: typingUsers.isEmpty
                    ? () => _pickAndShowImageDialog(source: ImageSource.camera)
                    : null,
              ),
            IconButton(
              icon: const Icon(Icons.image),
              onPressed: typingUsers.isEmpty
                  ? () => _pickAndShowImageDialog()
                  : null,
            ),
          ],
        ),
        currentUser: Constants.user,
        onSend: _handleOnSendPressed,
        messages: messages,
        messageOptions: MessageOptions(
          showOtherUsersAvatar: true,
        ),
      ),
    );
  }
  // ... rest of the code ... 
}

State Variables:

  • _picker: An instance of ImagePicker to handle image selection.

  • _chatRepository: A getter to easily access the chatRepository.

  • messages: Stores the list of chat messages.

  • typingUsers: Tracks users who are currently typing.

Calling setState on these variables allows us to force a rebuild of the UI with the updated values.

Adding Text Support

void _handleOnSendPressed(ChatMessage textMessage) async {
  final userMessage = textMessage.copyWith(
    user: Constants.user,
    createdAt: DateTime.now(),
  );

  _addUserMessage(userMessage);

  final response = await _chatRepository.sendTextMessage(userMessage);

  setState(() {
    typingUsers.remove(Constants.ai);
  });

  response.fold<void>(
    (error) => _handleSendError(error: error, userMessage: userMessage),
    (chatMessage) => _handleSendSuccess(
      userMessage: userMessage,
      aiMessage: chatMessage,
    ),
  );
}

void _addUserMessage(ChatMessage message) {
  setState(() {
    typingUsers.add(Constants.ai);
    messages.insert(0, message);
  });
}

void _handleSendError({
  required String error,
  required ChatMessage userMessage,
}) {
  context.showErrorMessage(error);
}

void _handleSendSuccess({
  required ChatMessage userMessage,
  required ChatMessage aiMessage,
}) {
  setState(() {
    messages = [
      aiMessage,
      ...messages.map((m) {
        if (m.user.id == userMessage.user.id &&
            m.createdAt == userMessage.createdAt) {
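          // The matched user message is returned unchanged; this is where you
          // could update it (e.g. mark its status as delivered) if you wanted to.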
          return m;
        }
        return m;
      }),
    ];
  });
}

Here’s a short explanation of what each of the methods does:

  1. _handleOnSendPressed: This method handles sending a text message. We create a ChatMessage object with the text, add it to the messages list, and send it to GPT-4o using the _chatRepository.

  2. _addUserMessage: This method adds our message to the list of messages and updates the state to reflect that the AI is typing.

  3. _handleSendError: This method handles any errors that occur while sending a message. It shows an error message using the context.

  4. _handleSendSuccess: This method handles the successful sending of a message. It updates the messages list with the AI's response.

Adding Image Support

Future<void> _pickAndShowImageDialog({
  ImageSource source = ImageSource.gallery,
}) async {
  final XFile? image = await _picker.pickImage(source: source);

  if (image != null) {
    if (!mounted) return;

    final result = await context.showImageCaptionDialog(image);

    result.fold<void>(
      (error) => context.showErrorMessage(error),
      (right) async {
        final (:image, :caption) = right;

        await _sendImageMessage(image: image, caption: caption);
      },
    );
  }
}

Future<void> _sendImageMessage({
  required XFile image,
  required String caption,
}) async {
  final XFile(:mimeType, :name, :path) = image;

  final userMessage = ChatMessage(
    user: Constants.user,
    createdAt: DateTime.now(),
    text: caption,
    medias: [
      ChatMedia(
        url: path,
        fileName: name,
        type: MediaType.image,
        customProperties: {
          "mimeType": mimeType,
        },
      ),
    ],
  );

  _addUserMessage(userMessage);

  final response = await _chatRepository.sendImageMessage(userMessage);

  setState(() {
    typingUsers.remove(Constants.ai);
  });

  response.fold<void>(
    (error) => _handleSendError(error: error, userMessage: userMessage),
    (chatMessage) => _handleSendSuccess(
      userMessage: userMessage,
      aiMessage: chatMessage,
    ),
  );
}

Here’s a short explanation of what each of the methods does:

  1. _pickAndShowImageDialog: This method allows us to pick an image either from the gallery or the camera. If an image is selected, it shows a dialog for entering a caption for the image, and then sends the image message to the model.

  2. _sendImageMessage: This method sends an image message with the provided image and caption. It creates a ChatMessage with the image details, adds it to the messages list, and sends it to the model using the _chatRepository.

The complete ChatPage

import "dart:io";
import "package:dash_chat_2/dash_chat_2.dart";
import "package:flutter/foundation.dart";
import "package:flutter/material.dart";
import "package:image_picker/image_picker.dart";
import "package:multi_modal/chat_repository.dart";
import "package:multi_modal/constants.dart";
import "package:multi_modal/extensions/build_context_extension.dart";
import "package:multi_modal/extensions/chat_message_extension.dart";

class ChatPage extends StatefulWidget {
  const ChatPage({
    super.key,
    required this.chatRepository,
  });

  final ChatRepository chatRepository;

  @override
  State<ChatPage> createState() => _ChatPageState();
}

class _ChatPageState extends State<ChatPage> {
  final ImagePicker _picker = ImagePicker();
  ChatRepository get _chatRepository => widget.chatRepository;

  List<ChatMessage> messages = [];
  List<ChatUser> typingUsers = [];

  @override
  Widget build(BuildContext context) {
    final isMobilePlatform = defaultTargetPlatform == TargetPlatform.iOS ||
        defaultTargetPlatform == TargetPlatform.android;

    return Scaffold(
      appBar: AppBar(
        backgroundColor: Theme.of(context).colorScheme.inversePrimary,
        title: const Text("Chat with Dash"),
      ),
      body: DashChat(
        typingUsers: typingUsers,
        inputOptions: InputOptions(
          inputDisabled: typingUsers.isNotEmpty,
          sendOnEnter: true,
          trailing: [
            if (isMobilePlatform)
              IconButton(
                icon: const Icon(Icons.camera_alt),
                onPressed: typingUsers.isEmpty
                    ? () => _pickAndShowImageDialog(
                          source: ImageSource.camera,
                        )
                    : null,
              ),
            IconButton(
              icon: const Icon(Icons.image),
              onPressed: typingUsers.isEmpty
                  ? () => _pickAndShowImageDialog()
                  : null,
            ),
          ],
        ),
        currentUser: Constants.user,
        onSend: _handleOnSendPressed,
        messages: messages,
        messageOptions: MessageOptions(
          showOtherUsersAvatar: true,
        ),
      ),
    );
  }

  Future<void> _pickAndShowImageDialog({
    ImageSource source = ImageSource.gallery,
  }) async {
    final XFile? image = await _picker.pickImage(source: source);

    if (image != null) {
      if (!mounted) return;

      final result = await context.showImageCaptionDialog(image);

      result.fold<void>(
        (error) => context.showErrorMessage(error),
        (right) async {
          final (:image, :caption) = right;

          await _sendImageMessage(
            image: image,
            caption: caption,
          );
        },
      );
    }
  }

  Future<void> _sendImageMessage({
    required XFile image,
    required String caption,
  }) async {
    final XFile(:mimeType, :name, :path) = image;

    final userMessage = ChatMessage(
      user: Constants.user,
      createdAt: DateTime.now(),
      text: caption,
      medias: [
        ChatMedia(
          url: path,
          fileName: name,
          type: MediaType.image,
          customProperties: {
            "mimeType": mimeType,
          },
        ),
      ],
    );

    _addUserMessage(userMessage);

    final response = await _chatRepository.sendImageMessage(userMessage);

    setState(() {
      typingUsers.remove(Constants.ai);
    });

    response.fold<void>(
      (error) => _handleSendError(
        error: error,
        userMessage: userMessage,
      ),
      (chatMessage) => _handleSendSuccess(
        userMessage: userMessage,
        aiMessage: chatMessage,
      ),
    );
  }

  void _handleOnSendPressed(ChatMessage textMessage) async {
    final userMessage = textMessage.copyWith(
      user: Constants.user,
      createdAt: DateTime.now(),
    );

    _addUserMessage(userMessage);

    final response = await _chatRepository.sendTextMessage(userMessage);

    setState(() {
      typingUsers.remove(Constants.ai);
    });

    response.fold<void>(
      (error) => _handleSendError(
        error: error,
        userMessage: userMessage,
      ),
      (chatMessage) => _handleSendSuccess(
        userMessage: userMessage,
        aiMessage: chatMessage,
      ),
    );
  }

  void _addUserMessage(ChatMessage message) {
    setState(() {
      typingUsers.add(Constants.ai);
      messages.insert(0, message);
    });
  }

  void _handleSendError({
    required String error,
    required ChatMessage userMessage,
  }) {
    context.showErrorMessage(error);
  }

  void _handleSendSuccess({
    required ChatMessage userMessage,
    required ChatMessage aiMessage,
  }) {
    setState(() {
      messages = [
        aiMessage,
        ...messages.map((m) {
          if (m.user.id == userMessage.user.id &&
              m.createdAt == userMessage.createdAt) {
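            // The matched user message is returned unchanged; this is where you
            // could update it (e.g. mark its status as delivered) if you wanted to.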
            return m;
          }
          return m;
        }),
      ];
    });
  }
}

The Either Util: Handling Success and Failure Gracefully

Before we wrap up with the UI, it's important to discuss why we frequently see .fold in our code. The methods we're calling in the ChatRepository can encounter failures for various reasons, such as network issues, server errors, or invalid data. Handling these failures gracefully is crucial because it helps us diagnose what went wrong and ensures a smoother user experience.

In our implementation, each method can either fail or succeed, and we represent these outcomes using a pattern from functional programming. Specifically, we use the Either type, defined in either.dart, where Left signifies a failure and Right indicates a success. This approach is common in functional programming languages like Haskell, allowing us to manage errors and successes in a clean and predictable manner.

If you're interested in this pattern and want a more comprehensive solution in Flutter/Dart, you might want to explore the fpdart package. It provides a robust set of tools for functional programming in Dart. You can find more information and resources here: https://pub.dev/packages/fpdart.

/// A sealed class representing a value of one of two possible types (a disjoint union).
/// Instances of `Either` are either an instance of `Left` or `Right`.
///
/// The `Either` type is often used as an alternative to `Option` for dealing with possible missing values.
/// In this usage, `Left` is used for failure and `Right` is used for success.
///
/// Example usage:
/// ```dart
/// Either<String, int> divide(int a, int b) {
///   if (b == 0) {
///     return Left("Cannot divide by zero");
///   } else {
///     return Right(a ~/ b);
///   }
/// }
///
/// void main() {
///   final result = divide(4, 2);
///
///   result.fold(
///     (left) => print("Error: $left"),
///     (right) => print("Result: $right"),
///   );
/// }
/// ```
///
/// The `fold` method allows you to apply a function based on whether the value is `Left` or `Right`.
///
/// Example usage of `fold`:
/// ```dart
/// final either = Right<String, int>(42);
///
/// final result = either.fold(
///   (left) => "Error: $left",
///   (right) => "Success: $right",
/// );
///
/// print(result); // Output: Success: 42
/// ```
sealed class Either<L, R> {
  const Either();

  /// Apply a function based on whether the value is `Left` or `Right`.
  T fold<T>(T Function(L left) fnL, T Function(R right) fnR);
}

class Left<L, R> extends Either<L, R> {
  final L value;

  const Left(this.value);

  @override
  T fold<T>(T Function(L left) fnL, T Function(R right) fnR) => fnL(value);
}

class Right<L, R> extends Either<L, R> {
  final R value;

  const Right(this.value);

  @override
  T fold<T>(T Function(L left) fnL, T Function(R right) fnR) => fnR(value);
}


Extensions: Enhancing Functionality with Dart's Extension Methods

We’re also implementing features such as `textMessage.copyWith` and `context.showErrorMessage`. These functionalities are made possible through Dart’s [extension methods](https://dart.dev/language/extension-methods). Extension methods are a powerful feature that allow us to add new capabilities to existing libraries or classes without modifying their source code. By using these methods, we can enhance the functionality of existing classes, making our code more modular and reusable. For instance, with `textMessage.copyWith`, we can create a modified copy of a text message object, adjusting only the properties we need to change.

Similarly, `context.showErrorMessage` provides a convenient way to display error messages within the current context, making the code clean and tidy.

BuildContextExtension

import "dart:io";
import "package:flutter/material.dart";
import "package:image_picker/image_picker.dart";
import "package:multi_modal/utils/either.dart";

typedef ImageCaptionDialogResult = ({XFile image, String caption});

extension BuildContextExtension on BuildContext {
  void showErrorMessage(String message) {
    final snackBar = SnackBar(
      content: Text(message),
      backgroundColor: Colors.red,
    );

    ScaffoldMessenger.of(this).showSnackBar(snackBar);
  }

  Future<Either<String, ImageCaptionDialogResult>> showImageCaptionDialog(
    XFile image,
  ) async {
    final TextEditingController captionController = TextEditingController();

    final result = await showDialog<ImageCaptionDialogResult>(
      context: this,
      builder: (BuildContext context) {
        return Dialog(
          child: Container(
            padding: const EdgeInsets.all(16),
            child: SingleChildScrollView(
              child: Column(
                mainAxisSize: MainAxisSize.min,
                crossAxisAlignment: CrossAxisAlignment.stretch,
                children: [
                  const Text(
                    "Preview & Add Caption",
                    style: TextStyle(
                      fontSize: 18,
                      fontWeight: FontWeight.bold,
                    ),
                    textAlign: TextAlign.center,
                  ),
                  const SizedBox(height: 16),
                  // Image preview with constrained height
                  ClipRRect(
                    borderRadius: BorderRadius.circular(8),
                    child: Image.file(
                      File(image.path),
                      height: 200,
                      fit: BoxFit.cover,
                    ),
                  ),
                  const SizedBox(height: 16),
                  TextField(
                    controller: captionController,
                    decoration: const InputDecoration(
                      hintText: "Add a caption...",
                      isDense: true,
                      border: OutlineInputBorder(),
                    ),
                  ),
                  const SizedBox(height: 16),
                  Row(
                    mainAxisAlignment: MainAxisAlignment.end,
                    children: [
                      TextButton(
                        onPressed: () => Navigator.pop(context),
                        child: const Text("Cancel"),
                      ),
                      const SizedBox(width: 8),
                      ElevatedButton(
                        onPressed: () {
                          Navigator.pop(
                            context,
                            (
                              image: image,
                              caption: captionController.text,
                            ),
                          );
                        },
                        child: const Text("Send"),
                      ),
                    ],
                  ),
                ],
              ),
            ),
          ),
        );
      },
    );

    return switch (result) {
      (image: XFile image, caption: String caption) => Right(
          (
            image: image,
            caption: caption,
          ),
        ),
      _ => Left("Operation Cancelled"),
    };
  }
}

This extension on the BuildContext class includes two methods:

  1. showErrorMessage: This method displays a Snackbar with an error message in red.

  2. showImageCaptionDialog: This method displays a dialog that allows us to preview an image and add a caption. It returns a record containing the image and caption if you confirm, or an error message if you cancel.

Records are an anonymous, immutable, aggregate type. Like other collection types, they let you bundle multiple objects into a single object. Unlike other collection types, records are fixed-sized, heterogeneous, and typed.
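
Here’s a tiny, self-contained illustration (with made-up values) of the named-record and destructuring syntax used in showImageCaptionDialog and _pickAndShowImageDialog:

void main() {
  // A named record bundles two values without declaring a class.
  final attachment = (image: "dash.png", caption: "Dash says hi");

  // The (:name) pattern pulls the fields out into local variables of the same name.
  final (:image, :caption) = attachment;

  print("$image - $caption"); // dash.png - Dash says hi
}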

ChatMessageExtension

We need to add a copyWith method to the ChatMessage instance since dash_chat_2 doesn’t support that yet. We copy the existing one and then we override some of its properties to get a ChatMessage object with the properties we want when we want them.

import "package:dash_chat_2/dash_chat_2.dart";

extension ChatMessageExtension on ChatMessage {
  ChatMessage copyWith({
    ChatUser? user,
    DateTime? createdAt,
    bool? isMarkdown,
    String? text,
    List<ChatMedia>? medias,
    List<QuickReply>? quickReplies,
    Map<String, dynamic>? customProperties,
    List<Mention>? mentions,
    MessageStatus? status,
    ChatMessage? replyTo,
  }) {
    return ChatMessage(
      user: user ?? this.user,
      createdAt: createdAt ?? this.createdAt,
      isMarkdown: isMarkdown ?? this.isMarkdown,
      text: text ?? this.text,
      medias: medias ?? this.medias,
      quickReplies: quickReplies ?? this.quickReplies,
      customProperties: customProperties ?? this.customProperties,
      mentions: mentions ?? this.mentions,
      status: status ?? this.status,
      replyTo: replyTo ?? this.replyTo,
    );
  }
}

Implementing the AI Logic

Creating the ChatRepository

import "dart:convert";
import "dart:io";
import "package:dash_chat_2/dash_chat_2.dart" as dash_chat;
import "package:flutter/foundation.dart";
import "package:langchain/langchain.dart";
import "package:langchain_openai/langchain_openai.dart";
import "package:multi_modal/constants.dart";
import "package:multi_modal/utils/either.dart";

typedef DashChatMessage = dash_chat.ChatMessage;
typedef DashChatMedia = dash_chat.ChatMedia;

class ChatRepository {
  const ChatRepository();

  static final chatModel = ChatOpenAI(
    apiKey: Platform.environment["OPENAI_API_KEY"],
    defaultOptions: ChatOpenAIOptions(
      model: "gpt-4o",
      temperature: 0,
    ),
  );

  static final memory = ConversationBufferWindowMemory(
    aiPrefix: Constants.ai.firstName ?? AIChatMessage.defaultPrefix,
    humanPrefix: Constants.user.firstName ?? HumanChatMessage.defaultPrefix,
  );
...

LangChain.dart and DashChat2 both export a class named ChatMessage. Without the typedefs declared, there would be an import conflict and the Dart analyser wouldn’t know which ChatMessage belongs to which package.

From https://dart.dev/language/typedefs:

A type alias—often called a typedef because it's declared with the keyword typedef—is a concise way to refer to a type. Here's an example of declaring and using a type alias named IntList:

typedef IntList = List<int>;

IntList il = [1, 2, 3];
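
For completeness, the same clash could also be handled with an import prefix alone. Here’s a sketch (not how the repository is written, and the message values are made up):

import "package:dash_chat_2/dash_chat_2.dart"; // ChatMessage and ChatUser for the UI
import "package:langchain/langchain.dart" as lc; // lc.ChatMessage for the model

void main() {
  final uiMessage = ChatMessage(
    user: ChatUser(id: "1", firstName: "Henry"),
    createdAt: DateTime.now(),
    text: "Hello Dash!",
  );

  // Qualifying with the prefix makes it obvious which ChatMessage we mean.
  final modelMessage = lc.ChatMessage.human(
    lc.ChatMessageContent.text(uiMessage.text),
  );

  print(modelMessage);
}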

Explanation of Key Components

  1. ChatOpenAI:
  • This is an instance of the ChatOpenAI class from the langchain_openai package.

  • It is initialised with an API key and default options, including the model (GPT-4o) and temperature (0).

  2. ConversationBufferWindowMemory: This is a type of memory in LangChain.dart that keeps the last k message pairs exchanged with the model. By default k is 5, which means the last 5 exchanges, not individual messages, are persisted. I find this kind of memory best if you want to make sure recent details aren’t lost as the chat goes on, although it becomes a problem if k is too large or too small. You can try ConversationSummaryMemory instead if you don’t mind trading detail for a running summary of what has happened from the beginning to the present.

    💡
    Don’t make the memory variable non-final and don’t move it into a method. We need to use the same instance of it across multiple interactions and doing anything that causes it to create a new instance will result in us losing/resetting the chat history.
  • This manages the conversation history, allowing the chat model to maintain context.

  • aiPrefix and humanPrefix are set using constants or default values

The aiPrefix and humanPrefix are simply a way for the model to know who’s who when we give it the history of the conversation so far. I find it much better than leaving the defaults of “AI” and “Human”, and it makes reading the chat history yourself a better experience as well.
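
If the default window doesn’t fit your use case, k can be tuned when constructing the memory. Here’s a sketch reusing the same constants as above; the value 10 is arbitrary:

static final memory = ConversationBufferWindowMemory(
  k: 10, // keep the last 10 human/AI exchanges instead of the default 5
  aiPrefix: Constants.ai.firstName ?? AIChatMessage.defaultPrefix,
  humanPrefix: Constants.user.firstName ?? HumanChatMessage.defaultPrefix,
);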

Text Message Handling

Future<Either<String, DashChatMessage>> sendTextMessage(
  DashChatMessage chatMessage,
) async {
  try {
    final history = await memory.loadMemoryVariables();

    final humanMessage = chatMessage.text;

    final prompt = PromptValue.chat([
      ChatMessage.system(
        """
          You are Dash, the enthusiastic and creative mascot of Flutter. 
          Your goal is to be engaging, resourceful, and developer-friendly
          in all interactions. 

          Prioritize brevity. Use short sentences and minimal words. For complex
          topics, break information into small, digestible pieces.

          Guidelines for responses:
          - Use **Flutter-specific terminology** and relevant examples wherever
            possible.
          - Provide **clear, step-by-step guidance** for technical topics.
          - Ensure all responses are beautifully formatted in **Markdown**:
              - Use headers (`#`, `##`) to structure content.
              - Highlight important terms with **bold** or *italicized* text.
              - Include inline code (`code`) or code blocks (```language) for
                code snippets.
              - Use lists, tables, and blockquotes for clarity and emphasis.
          - Maintain a friendly, approachable tone.

          This is the history of the conversation so far:
          $history
          """,
      ),
      ChatMessage.human(
        ChatMessageContent.text(humanMessage),
      ),
    ]);

    final chain = chatModel.pipe(const StringOutputParser());

    final response = await chain.invoke(prompt);

    debugPrint("response: $response");

    await memory.saveContext(
      inputValues: {"input": humanMessage},
      outputValues: {"output": response},
    );

    return Right(
      DashChatMessage(
        isMarkdown: true,
        user: Constants.ai,
        createdAt: DateTime.now(),
        text: response,
      ),
    );
  } on Exception catch (error, stackTrace) {
    debugPrint("sendTextMessage error: $error, stackTrace: $stackTrace");

    if (error is OpenAIClientException) {
      return Left(error.message);
    }

    return Left("Something went wrong. Try again Later.");
  }
}

The method above takes the ChatMessage containing the input text from the ChatPage, sends it to the model, and returns the completion. As for why we’re trying to get Dash to return the data in Markdown, it’s because DashChat2 supports it with the help of https://pub.dev/packages/flutter_markdown. This means we can get the response formatted really nicely, especially if it’s a response containing code.

Let me explain what’s going on step-by-step:

  • First we load the conversation history from memory. You can print it out if you want to see what’s going on each time.

  • Then we prepare the prompt with PromptValue.chat. We need to use this type of prompt if we want to have multi-modality; the regular PromptTemplate doesn’t have that support yet. The system message defines the guidelines for the chat model, and the structure being used is a well-known one. Giving the model a role is surprisingly effective at getting the responses you want; you should try changing it to Python and see how fun it is.

  • Thirdly, we construct our chain, which is a simple one: the chatModel we defined previously is piped into a StringOutputParser. This ensures we get a plain string as the result; otherwise we’d get a ChatResult object.

  • Next, we invoke the chain on the prompt which sends the prompt together with the history and the text message to the model.

  • I’m printing the response for debugging purposes and then passing it and the humanMessage to memory.saveContext.

    💡
    memory.saveContext saves the input and output to memory to maintain the conversation history.

  • If successful, we return a DashChatMessage object, which is actually a ChatMessage from DashChat2, containing the response from the chat model.

  • If an exception occurs, it is caught and printed for debugging.

  • If the error is an OpenAIClientException, we return the error message since it’s more human-readable.

  • For other exceptions, we return a generic error message.

The success and error responses are wrapped in the Either we defined in the either.dart file, to ensure that we don’t forget to handle either of them.

Image Message Handling and Processing

Future<Either<String, DashChatMessage>> sendImageMessage(
  DashChatMessage chatMessage,
) async {
  final medias = chatMessage.medias ?? <DashChatMedia>[];

  final mediaContents = <ChatMessageContent>[];

  try {
    if (medias.isNotEmpty) {
      for (final DashChatMedia(:url, :customProperties) in medias) {
        final isExternal = Uri.tryParse(url)?.hasScheme ?? false;

        final data = isExternal
            ? url
            : base64Encode(File(url).readAsBytesSync());

        mediaContents.add(
          ChatMessageContent.image(
            mimeType: customProperties?["mimeType"] ?? "image/jpeg",
            data: data,
          ),
        );
      }
    }

    final history = await memory.loadMemoryVariables();

    debugPrint("history: $history");

    final humanMessage = chatMessage.text;

    final prompt = PromptValue.chat([
      ChatMessage.system(
        """
          You are Dash, the enthusiastic and creative mascot of Flutter. 
          Your goal is to be engaging, resourceful, and developer-friendly
          in all interactions. 
          Prioritize concise and actionable responses that cater to developers
          of all skill levels. 

          Guidelines for responses:
          - Use **Flutter-specific terminology** and relevant examples wherever
            possible.
          - Provide **clear, step-by-step guidance** for technical topics.
          - Ensure all responses are beautifully formatted in **Markdown**:
              - Use headers (`#`, `##`) to structure content.
              - Highlight important terms with **bold** or *italicized* text.
              - Include inline code (`code`) or code blocks (```language) for
                code snippets.
              - Use lists, tables, and blockquotes for clarity and emphasis.
          - Maintain a friendly, approachable tone.

          This is the history of the conversation so far:
          $history
          """,
      ),
      ChatMessage.human(
        ChatMessageContent.multiModal([
          ChatMessageContent.text(humanMessage),
          ...mediaContents,
        ]),
      ),
    ]);

    final chain = chatModel.pipe(const StringOutputParser());

    final response = await chain.invoke(prompt);

    debugPrint("response: $response");

    await memory.saveContext(
      inputValues: {"input": humanMessage},
      outputValues: {"output": response},
    );

    return Right(
      DashChatMessage(
        isMarkdown: true,
        user: Constants.ai,
        createdAt: DateTime.now(),
        text: response,
      ),
    );
  } on Exception catch (error, stackTrace) {
    debugPrint("sendImageMessage error: $error, stackTrace: $stackTrace");

    if (error is OpenAIClientException) {
      return Left(error.message);
    }

    return Left("Something went wrong. Try again Later.");
  }
}

For image messages with an optional caption, we need to do a bit of processing to get the media into the shape we want. With OpenAI, we can send the image either as a base64-encoded string or as the URL where the image is hosted. That’s why we check whether each media item’s URL has a URI scheme or not.

Step-by-step this is what we’re doing:

  • Variable initialisation:

    • medias is a list of media files (images) attached to the chat message.

    • mediaContents is a list that will store the processed image contents.

  • Processing Media Files:

    • We check if there are any media files.

    • For each media file, we determine if the URL is external or a local file.

    • If it's a local file, we read the file and encode it in base64.

    • Then we add the image content to mediaContents list.

  • The rest is the same as with the text message except we are including the human message as well as the processed media contents in the payload.

Testing Your Chatbot

To test the chatbot:

  1. Ensure your OpenAI API key or baseUrl is set.

  2. Ensure you're in your project's directory in the terminal if using the CLI.

  3. Run the app in debug mode with the help of your IDE or copy/paste flutter run into your terminal and press Enter.

Running Your First Conversation

You should be able to run the app on all platforms. Try passing in images from a URL, the camera, or your gallery.

Full Source Code: Access the Complete Project on GitHub

The full source code can be found here: https://github.com/Nana-Kwame-bot/langchain_flutter_examples/tree/main/apps/multi_modal

I’d appreciate any PRs targeting bugs or cool features we didn’t have time to implement.

Next Steps and Resources

Enhancing Your Chatbot

You can try fine-tuning the models you use here, but not before you craft custom prompts to guide the model more effectively. Alternatively, use a different model like Google’s gemini-1.5-flash, or Ollama’s Llama 3.2 for a fully local experience (supported only on desktops), that is, if your PC can handle it. I have steps on how to set that up here: https://henryadu.hashnode.dev/step-by-step-guide-to-local-rag-with-ollamas-gemma-2-and-langchaindart
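
To give you a feel for how small that swap is, here’s a sketch using the langchain_ollama package. It assumes you’ve added langchain_ollama to pubspec.yaml and have Ollama running locally with the llama3.2 model pulled; note that plain llama3.2 is text-only, so image messages would need a vision-capable model:

import "package:langchain_ollama/langchain_ollama.dart";

// A drop-in replacement for the ChatOpenAI instance in ChatRepository.
final chatModel = ChatOllama(
  defaultOptions: ChatOllamaOptions(
    model: "llama3.2",
    temperature: 0,
  ),
);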

Useful Resources

LangChain Documentation

Check out the LangChain.dart documentation as well. David (the creator of LangChain.dart) has created a valuable resource from which you can learn a lot, for example, how to save your conversation history permanently on the device or in the cloud with the help of vector stores.

Dash Chat 2 API Reference

There are a lot more customisations you can make to the ChatPage. Check out the API reference for things like loading earlier messages and custom scroll-to-bottom widgets.

Dash Chat 2 might be lacking in some features; if you want advanced customisation, use Flyer Chat.

OpenAI API Guides

The API reference section in the OpenAI docs explains the parameters you can pass to their models. Check it out here: https://platform.openai.com/docs/api-reference/introduction

Conclusion

Building and using this multi-modal Flutter chatbot must have been exciting. You now have the basic skills to develop a functional chatbot for your own needs. I encourage you to experiment with different models and extra features that’ll create a better user experience. Subscribe to my newsletter to get access to the cooler stuff I’m going to write about soon. If you’re having issues with this project or you want to talk with me, you can friend me on Discord. Alternatively, join the community Discord here for support: https://discord.gg/Q6PrSmRPGe. David posts a lot of educational content about LangChain, research, and news in the AI ecosystem.
