Text Classification Using Gemini and LangChain.dart

Techniques for Text Classification and Sentiment Analysis with Google's Gemini and LangChain.dart


Introduction

Hey there, fellow Flutter and Dart devs!

If you’ve ever been tasked with automatically classifying data, specifically text data, you know how much of a pain it can be, especially as a Flutter dev. Since most of us only know Dart, we have to wrangle with Python and JS frameworks to get what we want, even with new AI models popping up.

We now have LangChain.dart, and to be honest, it feels good to work with. You don’t need to learn the quirks of many different models to get things done, and you don’t need to pick up a new JS framework or Python to implement text classification.

In this walkthrough, I'll show you how to go from “This classification can be challenging to implement” to “Wow, that was surprisingly easy” using this port of Python's LangChain framework. Sure, Gemini will be doing most of the heavy lifting, but it’s LangChain that you’ll come to love, especially how it abstracts away the things you don’t want to work with.

First, we need somewhere to work.


Creating a command line application

  1. Open your favourite text editor

  2. Change your directory to your projects directory

  3. Run dart create sentiment_analysis

  4. You should see this

Creating sentiment_analysis using template console...

  .gitignore
  analysis_options.yaml
  CHANGELOG.md
  pubspec.yaml
  README.md
  bin/sentiment_analysis.dart
  lib/sentiment_analysis.dart
  test/sentiment_analysis_test.dart

Running pub get...                     3.5s
  Resolving dependencies...
  Downloading packages...
  Changed 50 dependencies!
  5 packages have newer versions incompatible with dependency constraints.
  Try `dart pub outdated` for more information.

Created project sentiment_analysis in sentiment_analysis! In order to get started, run the following commands:

  cd sentiment_analysis
  dart run

  5. We’re going to be working in the lib directory, so feel free to remove any code in the bin directory.
rm -r bin
cd lib
  6. Now add the necessary dependencies to your project’s pubspec.yaml file:
dependencies:
  langchain: {version}
  langchain_google: {version}

You can also do this using the CLI:

dart pub add langchain
dart pub add langchain_google

We also want to use Gemini's APIs, which is why we added the LangChain.dart Google package.


Obtaining Your API Key

You'll need to grab an API key from Google AI Studio. Here's the step-by-step:

  1. Head to https://aistudio.google.com/app/apikey

  2. Click "Create API Key" if you don’t have one.

  3. Copy your newly generated key or the existing one.

The Gemini API has a free tier for testing purposes so don’t worry about cost for now.

I don’t hardcode my API key directly in my source code, and I hope you don’t either. There are many ways to keep it safe, and I’m not going to delve into those here, so for now let’s just export it as an environment variable for quick testing.

macOS/Linux:

export GOOGLEAI_API_KEY='your_actual_api_key_here'

Windows (PowerShell):

$env:GOOGLEAI_API_KEY='your_actual_api_key_here'

Replace `your_actual_api_key_here` with your actual key from Google AI Studio and ensure you’re in the sentiment_analysis directory when you run the commands above.

Verifying the Setup

Verify your environment variable is set correctly:

macOS/Linux:

echo $GOOGLEAI_API_KEY

Windows PowerShell:

$env:GOOGLEAI_API_KEY

Let’s dive into the code in sentiment_analysis.dart

void main() async {
  // Place snippets here
}

Choosing Your Gemini Model

The ChatGoogleGenerativeAI class supports multiple Gemini models. Let’s go with the latest, gemini-1.5-flash:

final chatModel = ChatGoogleGenerativeAI(
  apiKey: Platform.environment['GOOGLEAI_API_KEY']!,
  defaultOptions: const ChatGoogleGenerativeAIOptions(
    model: "gemini-1.5-flash",  // Choose your model
    temperature: 0,  // Control randomness
  ),
);
💡
We’re using lower temperatures (0-0.2) to make the responses deterministic. You’re free to increase it up to 1.0 to get responses that are more varied and creative for each example.

Setting Up the Sentiment Analysis Pipeline

Let's dive into crafting our sentiment analysis tool. The real power here is in the prompt template and the chain we'll set up. Although we haven’t fine-tuned our model for text classification, most language models (Gemini included) are smart enough to figure out what we’re trying to do, even though they haven’t been trained specifically for the task.

final promptTemplate = PromptTemplate.fromTemplate(
  '''
  Analyze the sentiment of the text below.
  Respond only with one word to describe the sentiment.

  INPUT: I absolutely adore sunny days!
  OUTPUT: POSITIVE

  INPUT: The sky is blue, and clouds are white.
  OUTPUT: NEUTRAL

  INPUT: I can't believe they canceled the show; it's so frustrating!
  OUTPUT: NEGATIVE

  INPUT: {text}
  OUTPUT:
  ''',
);

Let's break down what's happening in this prompt template. We're basically giving the AI a few examples of how we want it to classify sentiment. It’s like teaching a toddler how to group their toys: similar colours or shapes go here instead of everywhere :)

A prompt template is essentially a text string that can take in a set of parameters from the end user and generate a prompt. The prompt template may contain:

  • Instructions to the language model.

  • A set of few-shot examples to help the language model generate a better response.

  • A question or task for the language model.

For those not familiar with triple quotes in Dart, they are used to define multi-line strings. This way we can format the string in the prompt template without needing escape characters like \n. That makes the template much easier to read and maintain, which matters because you’re going to be editing it a lot, especially if you want Gemini to produce output exactly the way you want during actual development and testing.
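Here’s a quick sketch of the difference (the template text is illustrative):

```dart
// With \n escapes, the template's structure is hidden inside the string:
const escaped = 'INPUT: {text}\nOUTPUT:';

// With triple quotes, the template reads the way it renders.
// Dart drops the newline immediately after the opening quotes.
const multiline = '''
INPUT: {text}
OUTPUT:''';

void main() {
  print(escaped == multiline); // true
}
```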

The {text} is a placeholder that will be replaced with actual input text when the template is used. Now, let's set up the chain that'll make this happen:

final chain = promptTemplate.pipe(chatModel).pipe(const StringOutputParser());
💡
The StringOutputParser is a class in the langchain_dart library that takes the output of the previous Runnable in the chain and converts it into a String. This parser is useful when you need to ensure that the output from various types of inputs is consistently formatted as a string.

Each step in the pipeline can transform the data in some way, so the StringOutputParser is typically used at the end of such a pipeline to ensure that the final output is a string, regardless of the type of data produced by the previous steps.

You don’t need to use it each time but it makes it easier to display or log the results.

The one-liner above, written in LCEL (the LangChain Expression Language), is all you need. It's like a pipeline that:

  1. Takes our prompt template

  2. Runs it through the Gemini model

  3. Converts the output to a simple string

There’s an alternative syntax you could use instead in case you prefer the slightly less verbose approach.

final chain = promptTemplate | chatModel | const StringOutputParser();

The .pipe() method (or | operator) is similar to a Unix pipe operator: it chains the components together, feeding the output of one component as input into the next.

Let's see it in action. Add this to the main function:

final sentiment = await chain.invoke({
  "text": "The food here is absolutely delicious!"
});

print(sentiment); // Outputs: POSITIVE

Then run the file:

dart run sentiment_analysis.dart

That’s it, we've got our sentiment classification. This technique is also known as few-shot prompting.

Few-shot prompting is a technique used to improve the performance of language models by providing them with a small set of examples, also known as “demonstrations” or “few-shot examples,” within the prompt itself.

Note that if you constructed your chain like this:

final chain = promptTemplate | chatModel | const StringOutputParser();

The sentiment variable will be statically typed as Object rather than String, unlike with this:

final chain = promptTemplate.pipe(chatModel).pipe(const StringOutputParser());

Since we are parsing the generated output with StringOutputParser, using the pipe operator is not going to mess up the value produced, even if Dart’s static analysis doesn’t agree. You can confirm that with this:

print(sentiment.runtimeType); // Outputs: String

There's also an alternative approach that formats the prompt explicitly:

final res = promptTemplate
      .format({"text": "The food here is absolutely delicious!"});

The format() method allows you to explicitly prepare the prompt. This gives you more granular control over the prompt creation process. Also, you don’t have to lay pipes this way.

We can’t pass a raw string to invoke, so we wrap it in PromptValue.string():

final sentiment = await chatModel.invoke(PromptValue.string(res));

Both methods achieve the same end result of sentiment classification:

print(sentiment); // Outputs: POSITIVE

Here’s the complete code sample:

import "dart:io";
import "package:langchain/langchain.dart";
import "package:langchain_google/langchain_google.dart";

void main() async {
  final chatModel = ChatGoogleGenerativeAI(
    apiKey: Platform.environment["GOOGLEAI_API_KEY"],
    defaultOptions: const ChatGoogleGenerativeAIOptions(
      model: "gemini-1.5-flash",
      temperature: 0,
    ),
  );

  final promptTemplate = PromptTemplate.fromTemplate(
    '''
    Analyze the sentiment of the text below.
    Respond only with one word to describe the sentiment.

    INPUT: I absolutely adore sunny days!
    OUTPUT: POSITIVE

    INPUT: The sky is blue, and clouds are white.
    OUTPUT: NEUTRAL

    INPUT: I can't believe they canceled the show; it's so frustrating!
    OUTPUT: NEGATIVE

    INPUT: {text}
    OUTPUT:
    ''',
  );

  // final chain = promptTemplate | chatModel | const StringOutputParser();

  final chain = promptTemplate.pipe(chatModel).pipe(const StringOutputParser());

  final sentiment =
      await chain.invoke({"text": "The food here is absolutely delicious!"});

  // final res = promptTemplate
  //     .format({"text": "The food here is absolutely delicious!"});

  // final sentiment = await chatModel.invoke(PromptValue.string(res));

  print(sentiment); // Outputs: POSITIVE
}

Faking a conversation chain

With LangChain.dart, you can actually fake a conversation chain by defining different templates for different roles.

  • HumanChatMessage: A ChatMessage coming from a human/user.

  • AIChatMessage: A ChatMessage coming from an AI/assistant.

  • SystemChatMessage: A ChatMessage coming from the system.

  • FunctionChatMessage / ToolChatMessage: A ChatMessage containing the output of a function or tool call.

Once Gemini sees the style already established in the conversation, it’ll follow the same one and reply with just the sentiment value we’re looking for.

Create a new file for this since it’s a bit different.

touch fake_conversation.dart

Let’s use the same model setup:

void main() async {
  final chatModel = ChatGoogleGenerativeAI(
    apiKey: Platform.environment["GOOGLEAI_API_KEY"],
    defaultOptions: const ChatGoogleGenerativeAIOptions(
      model: "gemini-1.5-pro",
      temperature: 0,
    ),
  );
...

Here's how we're setting up our new sentiment classification pipeline:

// Define the system message template
const systemTemplate = '''
Analyze the sentiment of the text below.
Respond only with one word to describe the sentiment.
''';

  final fewShotPrompts = ChatPromptTemplate.fromPromptMessages([
    SystemChatMessagePromptTemplate.fromTemplate(systemTemplate),

     // Positive example
    HumanChatMessagePromptTemplate.fromTemplate('I am so happy today!'),
    AIChatMessagePromptTemplate.fromTemplate('POSITIVE'),

    // Neutral example
    HumanChatMessagePromptTemplate.fromTemplate('The sky is blue.'),
    AIChatMessagePromptTemplate.fromTemplate('NEUTRAL'),

    // Negative example
    HumanChatMessagePromptTemplate.fromTemplate(
        "I am very disappointed with the service."),
    AIChatMessagePromptTemplate.fromTemplate('NEGATIVE'),


    HumanChatMessagePromptTemplate.fromTemplate('I enjoy reading books.'),
  ]);

Let's break down what's happening:

  • System Message: We start with a clear instruction about what we want: a one-word sentiment description.

  • Few-Shot Examples: We're basically showing Gemini a mini-training set:

    • A positive sentiment example

    • A neutral sentiment example

    • A negative sentiment example

Invoking the model looks like this:

final aiChatMessage = await chatModel.call(fewShotPrompts.formatMessages());

We’re using .call since we need to pass in a list of messages, and we call .formatMessages on the prompt to get them as a List<ChatMessage>.

Running the file produces this output:

print(aiChatMessage.content); // Outputs: POSITIVE

The prompt templates can also be constructed in other ways.

final fewShotPrompts = ChatPromptTemplate.fromTemplates([
    (ChatMessageType.system, systemTemplate),
    (ChatMessageType.human, 'I am so happy today!'),
    (ChatMessageType.ai, 'POSITIVE'),
    (ChatMessageType.human, 'The sky is blue.'),
    (ChatMessageType.ai, 'NEUTRAL'),
    (ChatMessageType.human, "I am very disappointed with the service."),
    (ChatMessageType.ai, 'NEGATIVE'),
    (ChatMessageType.human, 'I enjoy reading books.'),
  ]);

The invocation remains the same for both, so choose whichever you find more convenient.

final aiChatMessage = await chatModel.call(fewShotPrompts.formatMessages());

print(aiChatMessage.content); // Outputs: POSITIVE

Here’s the complete code sample:

import 'dart:io';
import 'package:langchain/langchain.dart';
import 'package:langchain_google/langchain_google.dart';

void main() async {
  final chatModel = ChatGoogleGenerativeAI(
    apiKey: Platform.environment["GOOGLEAI_API_KEY"],
    defaultOptions: const ChatGoogleGenerativeAIOptions(
      model: "gemini-1.5-pro",
      temperature: 0,
    ),
  );

  // Define the system message template
  const systemTemplate = '''
  Analyze the sentiment of the text below.
  Respond only with one word to describe the sentiment.
  ''';

  // Define the few-shot examples
  final fewShotPrompts = ChatPromptTemplate.fromPromptMessages([
    SystemChatMessagePromptTemplate.fromTemplate(systemTemplate),
    HumanChatMessagePromptTemplate.fromTemplate('I am so happy today!'),
    AIChatMessagePromptTemplate.fromTemplate('POSITIVE'),
    HumanChatMessagePromptTemplate.fromTemplate('The sky is blue.'),
    AIChatMessagePromptTemplate.fromTemplate('NEUTRAL'),
    HumanChatMessagePromptTemplate.fromTemplate(
        "I am very disappointed with the service."),
    AIChatMessagePromptTemplate.fromTemplate('NEGATIVE'),
    HumanChatMessagePromptTemplate.fromTemplate('I enjoy reading books.'),
  ]);

  // final fewShotPrompts = ChatPromptTemplate.fromTemplates([
  //   (ChatMessageType.system, systemTemplate),
  //   (ChatMessageType.human, 'I am so happy today!'),
  //   (ChatMessageType.ai, 'POSITIVE'),
  //   (ChatMessageType.human, 'The sky is blue.'),
  //   (ChatMessageType.ai, 'NEUTRAL'),
  //   (ChatMessageType.human, "I am very disappointed with the service."),
  //   (ChatMessageType.ai, 'NEGATIVE'),
  //   (ChatMessageType.human, 'I enjoy reading books.'),
  // ]);

  final aiChatMessage = await chatModel.call(fewShotPrompts.formatMessages());

  print(aiChatMessage.content); // Outputs: POSITIVE
}

What if we just ask the model?

I’m sure you’re not totally convinced that we needed to do all that to get the sentiment, and you’re right to be skeptical. Anyone can simply paste the texts into ChatGPT and get the same results. We probably didn’t need to give it examples or fake a conversation chain.

So how do we do things better? As Flutter/Dart developers, we definitely want precise, typed, structured outputs.

Enter prompting with a JSON schema, a technique that lets you get precise, structured output directly from the language model. The schema follows the specification defined here: https://json-schema.org/

We don’t want to lose our progress so far, so let’s move to a new file.

touch zero_shot.dart

You can name the above differently if you want.

Now, let’s define an enum for clear sentiment categorization:

enum Sentiment { positive, neutral, negative }

void main() async {
...
}

Now we enhance our model configuration, mainly the ChatGoogleGenerativeAIOptions:

final chatModel = ChatGoogleGenerativeAI(
    apiKey: Platform.environment["GOOGLEAI_API_KEY"],
    defaultOptions: ChatGoogleGenerativeAIOptions(
        responseMimeType: "application/json",
        model: "gemini-1.5-flash",
        temperature: 0,
        responseSchema: {
          "type": "object",
          "properties": {
            "sentiment": {
              "type": "string",
              "enum": Sentiment.values.map((e) => e.name).toList()
            }
          },
          "required": ["sentiment"]
        }),
  );

We define our responseSchema and pass it to the ChatGoogleGenerativeAIOptions to ensure a structured output.

Structured Outputs is a feature that ensures the model will always generate responses that adhere to your supplied JSON Schema, so you don't need to worry about the model omitting a required key, or hallucinating an invalid enum value.

Okay, what’s happening here?

  • responseMimeType: "application/json" tells the model to return JSON
💡
Note: We’re currently limited to application/json.
  • responseSchema defines exactly what we want: a JSON object with a sentiment property

  • We're dynamically creating an enum list from our Sentiment values

The "enum": Sentiment.values.map((e) => e.name).toList() part serves a specific purpose in the JSON schema because when you define an enum in the JSON schema, you're essentially telling Gemini the exact set of allowed values for that field.

In this case, Sentiment.values.map((e) => e.name).toList() converts the Dart enum values to a list of their string names.

This is our enum: enum Sentiment { positive, neutral, negative }

If you were to print the values out:

print(Sentiment.values); // Outputs [Sentiment.positive, Sentiment.neutral, Sentiment.negative]

By including this in the schema, you're:

  1. Constraining the model to only return these specific values

  2. Providing clear guidance on what constitutes a valid response

  3. Ensuring type safety by limiting the possible outputs

Without this constraint, Gemini might return random strings that don't match the enum. With the enum constraint, the model is guided to only return "positive", "neutral", or "negative".

This is perfect for this zero-shot prompting example because it gives the model clear boundaries while still allowing it to make an intelligent classification based on the input text. With a constraint like this, even weaker models produce decent results.

Finally, required: ["sentiment"] ensures we always get a sentiment key back.

The prompt has gotten simpler as well:

final promptTemplate = PromptTemplate.fromTemplate(
  '''
  Analyze the sentiment of the text below.
  Respond with a JSON object containing the sentiment.

  Text: {text}
  '''
);

Our chain now includes a JsonOutputParser since, you know, we’re returning JSON:

final chain = promptTemplate.pipe(chatModel).pipe(JsonOutputParser());

final resultJson = await chain.invoke({
  "text": "The food here is absolutely delicious!"
});

final sentiment = Sentiment.values.byName(resultJson['sentiment']);

print(sentiment); // Outputs: Sentiment.positive

The JsonOutputParser takes the output of the previous Runnable in the chain, converts it to a string, and then parses it as a JSON Map.
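One thing to watch: Sentiment.values.byName throws an ArgumentError if it's ever given a name that isn't in the enum (unlikely with the schema constraint, but cheap to guard against). Here's a small defensive sketch; the parseSentiment helper and the neutral fallback are my own additions, not part of the example above:

```dart
enum Sentiment { positive, neutral, negative }

/// Looks up a sentiment by name, falling back to neutral for anything
/// unexpected. `asNameMap` (Dart 2.15+) builds a name-to-value map.
Sentiment parseSentiment(String? name) =>
    Sentiment.values.asNameMap()[name?.toLowerCase()] ?? Sentiment.neutral;

void main() {
  print(parseSentiment('positive')); // Sentiment.positive
  print(parseSentiment('SHRUG')); // Sentiment.neutral
}
```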

Here’s the complete code sample:

import 'dart:io';
import 'package:langchain/langchain.dart';
import 'package:langchain_google/langchain_google.dart';

// Enum to represent sentiment
enum Sentiment { positive, neutral, negative }

void main() async {
  final chatModel = ChatGoogleGenerativeAI(
    apiKey: Platform.environment["GOOGLEAI_API_KEY"],
    defaultOptions: ChatGoogleGenerativeAIOptions(
        responseMimeType: "application/json",
        model: "gemini-1.5-flash",
        temperature: 0,
        responseSchema: {
          "type": "object",
          "properties": {
            "sentiment": {
              "type": "string",
              "enum": Sentiment.values.map((e) => e.name).toList()
            }
          },
          "required": ["sentiment"]
        }),
  );

  final promptTemplate = PromptTemplate.fromTemplate(
    '''
    Analyze the sentiment of the text below.
    Respond with a JSON object containing the sentiment.

    Text: {text}
    ''',
  );

  final chain = promptTemplate.pipe(chatModel).pipe(JsonOutputParser());

  final resultJson =
      await chain.invoke({"text": "The food here is absolutely delicious!"});

  final sentiment = Sentiment.values.byName(resultJson['sentiment']);

  print(sentiment); // Outputs: Sentiment.positive
}

The Embedding Classification Approach

Unlike the prompt-powered approaches we’ve used so far, embeddings are a more sophisticated technique that transforms a plain text-matching problem into a vector-space similarity computation.

From LangChain.dart:

Embeddings create a vector representation of a piece of text. This is useful because it means we can think about text in the vector space, and do things like semantic search where we look for pieces of text that are most similar in the vector space.

So we’re storing semantic meaning as numbers in vectors, and classification becomes a matter of comparing which texts are closest in meaning. This gives us a more nuanced and contextually aware classification.

In the upcoming example, we're going to be building a book genre classifier that can take a new book description and automatically figure out which genre bucket it belongs in.

We’re not using ChatGoogleGenerativeAI this time, although we’re re-using the API key, since a Google API key also grants access to text embedding models like text-embedding-004, which is the default for GoogleGenerativeAIEmbeddings in LangChain.dart as of 29th November 2024.

As we’ve done previously, create a new file in the lib directory. I’m going to call mine book_classification.dart

touch book_classification.dart

With this at the start:

void main() async {
  final embeddings = GoogleGenerativeAIEmbeddings(
    apiKey: Platform.environment['GOOGLEAI_API_KEY'],
  );
...
}

Document Setup

final documents = [
    Document(
      pageContent: '''
      The Hobbit by J.R.R. Tolkien
      A classic fantasy novel following the journey of Bilbo Baggins as he embarks on a quest to help dwarves reclaim their homeland from a dragon.

      Harry Potter and the Sorcerer's Stone by J.K. Rowling
      A young wizard discovers his magical heritage and attends Hogwarts School, where he makes friends, uncovers secrets, and battles dark forces.

      The Name of the Wind by Patrick Rothfuss
      The story of Kvothe, a gifted young man, and his rise from humble beginnings to a legendary figure.
      ''',
      metadata: {'title': 'Fantasy'},
    ),
    Document(
      pageContent: '''
      1984 by George Orwell
      A dystopian novel set in a totalitarian society under constant surveillance, exploring themes of control, truth, and rebellion.

      Brave New World by Aldous Huxley
      A chilling vision of a future society where individuals are conditioned to conform, and emotions and individuality are suppressed.

      Fahrenheit 451 by Ray Bradbury
      A story about a fireman in a future society where books are banned and burned to suppress dissenting ideas.
      ''',
      metadata: {'title': 'Dystopian'},
    ),
    Document(
      pageContent: '''
      Pride and Prejudice by Jane Austen
      A romantic novel about Elizabeth Bennet and her evolving relationship with the wealthy Mr. Darcy, set in 19th-century England.

      The Notebook by Nicholas Sparks
      A tale of enduring love between Noah and Allie, spanning decades and overcoming obstacles.

      Me Before You by Jojo Moyes
      A story about a young woman who becomes a caregiver for a paralyzed man, and the life-changing relationship they develop.
      ''',
      metadata: {'title': 'Romance'},
    ),
  ];

We're creating a mini library of pre-classified books. Each book comes with a synopsis and a genre metadata tag. We’re adding the metadata tag because:

Google AI supports specifying a document title when embedding documents. The title is then used by the model to improve the quality of the embeddings.

Embedding Generation

Using GoogleGenerativeAIEmbeddings, we'll transform these book descriptions into numerical vector representations. This helps us capture the semantic meaning in each document.

final documentsEmbeddings = await embeddings.embedDocuments(documents);

Once we get a “new” book that we want to classify, we do the same to it by generating an embedding for it.

final newBook = '''
    **The Fellowship of the Ring by J.R.R. Tolkien**

    This epic novel is the first installment of The Lord of the Rings trilogy. 
    It follows Frodo Baggins as he begins his journey to destroy the One Ring, 
    accompanied by a fellowship of friends and allies.

    Themes:
    * Adventure and camaraderie
    * The battle between good and evil
    * Sacrifice and heroism

    Setting:
    * The rich, fantastical world of Middle-earth, with locations like the Shire, Rivendell, and Moria.

    Characters:
    * Frodo Baggins, Samwise Gamgee, Aragorn, Gandalf, and more.
  ''';

  final newBookEmbedding = await embeddings.embedQuery(newBook);

Now you can find the most similar pre-existing genre by comparing vector similarities.

final mostSimilarIndex =
      getIndexesMostSimilarEmbeddings(newBookEmbedding, documentsEmbeddings)
          .first;

The getIndexesMostSimilarEmbeddings function finds which pre-existing book embeddings are most similar to our new book, using cosine similarity by default.

💡
Cosine similarity measures the cosine of the angle between two vectors in a vector space. It ranges from -1 to 1, where 1 indicates identical vectors, 0 indicates orthogonal (completely dissimilar) vectors, and -1 indicates diametrically opposed vectors.
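The computation itself is simple. Here's a minimal sketch of cosine similarity in plain Dart (the cosineSimilarity function is my own illustration; in practice LangChain.dart handles this for you via getIndexesMostSimilarEmbeddings):

```dart
import 'dart:math';

/// Cosine similarity between two equal-length vectors:
/// the dot product divided by the product of the vector magnitudes.
double cosineSimilarity(List<double> a, List<double> b) {
  var dot = 0.0, normA = 0.0, normB = 0.0;
  for (var i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (sqrt(normA) * sqrt(normB));
}

void main() {
  print(cosineSimilarity([1.0, 0.0], [1.0, 0.0])); // 1.0 (identical)
  print(cosineSimilarity([1.0, 0.0], [0.0, 1.0])); // 0.0 (orthogonal)
  print(cosineSimilarity([1.0, 0.0], [-1.0, 0.0])); // -1.0 (opposed)
}
```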

Calling .first on the getIndexesMostSimilarEmbeddings output gives us the index of the single most similar document. With that index, we can look up the category from the matching document’s metadata.

final category = documents[mostSimilarIndex].metadata['title'];

print('The new book belongs to the genre: $category'); // Outputs: Fantasy

In terms of succinctness, you can tell this approach is more to the point than the JSON schema example above. The list of documents is the only thing that takes up space, and once the embeddings are generated you don’t need to regenerate them for each new book you want to classify. Vector stores like MemoryVectorStore or VertexAIMatchingEngine make things even easier as you work with larger document sets.

Here’s the complete sample:

import 'dart:io';
import 'package:langchain/langchain.dart';
import 'package:langchain_google/langchain_google.dart';

void main() async {
  final embeddings = GoogleGenerativeAIEmbeddings(
    apiKey: Platform.environment['GOOGLEAI_API_KEY'],
  );

  final documents = [
    Document(
      pageContent: '''
      The Hobbit by J.R.R. Tolkien
      A classic fantasy novel following the journey of Bilbo Baggins as he embarks on a quest to help dwarves reclaim their homeland from a dragon.

      Harry Potter and the Sorcerer's Stone by J.K. Rowling
      A young wizard discovers his magical heritage and attends Hogwarts School, where he makes friends, uncovers secrets, and battles dark forces.

      The Name of the Wind by Patrick Rothfuss
      The story of Kvothe, a gifted young man, and his rise from humble beginnings to a legendary figure.
      ''',
      metadata: {'title': 'Fantasy'},
    ),
    Document(
      pageContent: '''
      1984 by George Orwell
      A dystopian novel set in a totalitarian society under constant surveillance, exploring themes of control, truth, and rebellion.

      Brave New World by Aldous Huxley
      A chilling vision of a future society where individuals are conditioned to conform, and emotions and individuality are suppressed.

      Fahrenheit 451 by Ray Bradbury
      A story about a fireman in a future society where books are banned and burned to suppress dissenting ideas.
      ''',
      metadata: {'title': 'Dystopian'},
    ),
    Document(
      pageContent: '''
      Pride and Prejudice by Jane Austen
      A romantic novel about Elizabeth Bennet and her evolving relationship with the wealthy Mr. Darcy, set in 19th-century England.

      The Notebook by Nicholas Sparks
      A tale of enduring love between Noah and Allie, spanning decades and overcoming obstacles.

      Me Before You by Jojo Moyes
      A story about a young woman who becomes a caregiver for a paralyzed man, and the life-changing relationship they develop.
      ''',
      metadata: {'title': 'Romance'},
    ),
  ];

  // Generate embeddings for the existing documents
  final documentsEmbeddings = await embeddings.embedDocuments(documents);

  // New book description to classify
  final newBook = '''
    **The Fellowship of the Ring by J.R.R. Tolkien**

    This epic novel is the first installment of The Lord of the Rings trilogy. 
    It follows Frodo Baggins as he begins his journey to destroy the One Ring, 
    accompanied by a fellowship of friends and allies.

    Themes:
    * Adventure and camaraderie
    * The battle between good and evil
    * Sacrifice and heroism

    Setting:
    * The rich, fantastical world of Middle-earth, with locations like the Shire, Rivendell, and Moria.

    Characters:
    * Frodo Baggins, Samwise Gamgee, Aragorn, Gandalf, and more.
  ''';

  // Generate embedding for the new book
  final newBookEmbedding = await embeddings.embedQuery(newBook);

  // Use LangChain's method to find the most similar embedding
  final mostSimilarIndex =
      getIndexesMostSimilarEmbeddings(newBookEmbedding, documentsEmbeddings)
          .first;

  // Get the category based on the most similar document
  final category = documents[mostSimilarIndex].metadata['title'];

  print('The new book belongs to the genre: $category'); // Outputs: Fantasy
}

Try to create a simple food classification project based on what you've learned so far, with categories like Appetizer, Main, and Dessert. You can use any language model to generate fake food recipes, then see if you can predict the category of a “new” recipe.


Conclusion

Integrating Gemini and LangChain.dart into your apps makes it easier for Flutter and Dart developers to implement text classification. By leveraging these tools, developers can bypass the complexities of traditional frameworks and models like those found in Python and JavaScript. Prompt templates, conversation chains, and JSON schemas for structured outputs give you flexibility in how you handle classification, and the embedding approach further improves your ability to categorize text by semantic similarity.

You can find all the examples here: https://github.com/Nana-Kwame-bot/langchain_examples

David has created an amazing package. Check out the documentation, and give the package a like if you enjoy what you've seen.

There are going to be a few more articles in this LangChain.dart series; subscribe to my newsletter and follow me to get them as soon as I publish.
