Text Classification Using Gemini and LangChain.dart
Techniques for Text Classification and Sentiment Analysis with Google's Gemini and LangChain.dart
Introduction
Hey there, fellow Flutter and Dart devs!
If you’ve ever been tasked with automatically classifying data, specifically text data, you know how much of a pain it can be, especially as a Flutter dev. Since most of us only know Dart, we have to wrangle with Python and JavaScript frameworks to get what we want, even with new AI models popping up.
We now have LangChain.dart and, to be honest, it feels good to work with. You don’t need to learn the quirks of many different models to get things done, and you don’t need to pick up a new JS framework or Python to implement text classification.
In this walkthrough, I'll show you how to go from “this classification can be challenging to implement” to “wow, that was surprisingly easy” using this port of Python's LangChain framework. Sure, Gemini will be doing most of the heavy lifting, but it’s LangChain that you’ll come to love, especially how it abstracts away the things you don’t want to work with.
First, we need somewhere to work.
Creating a command line application
Open your favourite text editor
Change your directory to your projects directory
Run
dart create sentiment_analysis
You should see this
Creating sentiment_analysis using template console...
.gitignore
analysis_options.yaml
CHANGELOG.md
pubspec.yaml
README.md
bin/sentiment_analysis.dart
lib/sentiment_analysis.dart
test/sentiment_analysis_test.dart
Running pub get... 3.5s
Resolving dependencies...
Downloading packages...
Changed 50 dependencies!
5 packages have newer versions incompatible with dependency constraints.
Try `dart pub outdated` for more information.
Created project sentiment_analysis in sentiment_analysis! In order to get started, run the following commands:
cd sentiment_analysis
dart run
- We’re going to be working in the lib directory, so feel free to remove any code in the bin directory.
rm -r bin
cd lib
- Now add the necessary dependencies to your project’s pubspec.yaml file:
dependencies:
langchain: {version}
langchain_google: {version}
You can also do this using the CLI:
dart pub add langchain
dart pub add langchain_google
We also want to use Gemini's APIs, which is why we added the LangChain.dart Google package.
Obtaining Your API Key
You'll need to grab an API key from Google AI Studio. Here's the step-by-step:
Click "Create API Key" if you don’t have one.
Copy your newly generated key or the existing one.
The Gemini API has a free tier for testing purposes so don’t worry about cost for now.
I don’t hardcode my API key directly in source code, and I hope you don’t either. There are many ways to keep it safe that I won’t delve into here, so for now let’s just export it as an environment variable for quick testing.
Mac/Linux:
export GOOGLEAI_API_KEY='your_actual_api_key_here'
Windows (PowerShell):
$env:GOOGLEAI_API_KEY='your_actual_api_key_here'
Replace `your_actual_api_key_here` with your actual key from Google AI Studio, and make sure you’re in the sentiment_analysis directory when you run the commands above.
Verifying the Setup
Verify your environment variable is set correctly:
macOS/Linux:
echo $GOOGLEAI_API_KEY
Windows PowerShell:
$env:GOOGLEAI_API_KEY
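You can also check from Dart itself that the variable is visible to your program. Here's a small standalone sketch (not part of the final app) using Platform.environment, which is how we'll read the key later:

```dart
import 'dart:io';

void main() {
  // Platform.environment exposes the process environment as a Map<String, String>.
  final apiKey = Platform.environment['GOOGLEAI_API_KEY'];
  if (apiKey == null || apiKey.isEmpty) {
    print('GOOGLEAI_API_KEY is not set. Export it before running the app.');
  } else {
    // Avoid printing the full key; just confirm it exists.
    print('GOOGLEAI_API_KEY is set (${apiKey.length} characters).');
  }
}
```

If the key isn't set, fix your export before moving on; everything that follows depends on it.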
Let’s dive into the code in sentiment_analysis.dart
void main() async {
// Place snippets here
}
Choosing Your Gemini Model
The ChatGoogleGenerativeAI
supports multiple Gemini models. Let’s go with the latest gemini-1.5-flash
final chatModel = ChatGoogleGenerativeAI(
apiKey: Platform.environment['GOOGLEAI_API_KEY']!,
defaultOptions: const ChatGoogleGenerativeAIOptions(
model: "gemini-1.5-flash", // Choose your model
temperature: 0, // Control randomness
),
);
You can raise the temperature toward 1.0 if you want responses that are more varied and creative for each example.
Setting Up the Sentiment Analysis Pipeline
Let's dive into crafting our sentiment analysis tool. The real power here is in the prompt template and the chain we'll set up. Although we haven’t fine-tuned our model to perform text classification, most language models (Gemini included) are smart enough to figure out what we’re trying to do even though they haven’t been trained specifically for the task.
final promptTemplate = PromptTemplate.fromTemplate(
'''
Analyze the sentiment of the text below.
Respond only with one word to describe the sentiment.
INPUT: I absolutely adore sunny days!
OUTPUT: POSITIVE
INPUT: The sky is blue, and clouds are white.
OUTPUT: NEUTRAL
INPUT: I can't believe they canceled the show; it's so frustrating!
OUTPUT: NEGATIVE
INPUT: {text}
OUTPUT:
''',
);
Let's break down what's happening in this prompt template. We're basically giving the AI a few examples of how we want it to classify sentiment. It’s like teaching a toddler how to group their toys: similar colours or shapes go together instead of everywhere :)
A prompt template is essentially a text string that can take in a set of parameters from the end user and generate a prompt. The prompt template may contain:
Instructions to the language model.
A set of few-shot examples to help the language model generate a better response.
A question or task for the language model.
For those not familiar with triple quotes in Dart, they are used to define multi-line strings. This way we can format the string in the prompt template without needing escape characters like \n. It also makes the template much easier to read and maintain, which matters because you’re going to be editing it a lot, especially if you want Gemini to produce output exactly the way you want during actual development/testing.
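For example, these two declarations produce the same string; the triple-quoted form just keeps the literal line breaks:

```dart
void main() {
  const single = 'Line one\nLine two';
  const triple = '''Line one
Line two''';
  // Triple quotes preserve the real line breaks, no \n needed.
  print(single == triple); // true
}
```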
The {text} is a placeholder that will be replaced with actual input text when the template is used. Now, let's set up the chain that'll make this happen:
final chain = promptTemplate.pipe(chatModel).pipe(const StringOutputParser());
StringOutputParser is a class in the langchain library that takes the output of the previous Runnable in the chain and converts it into a String. This parser is useful when you need to ensure that output from various types of inputs is consistently formatted as a string. Since each step in the pipeline can transform the data in some way, the StringOutputParser is typically used at the end of such a pipeline to ensure the final output is a string, regardless of the type of data produced by the previous steps.
You don’t need to use it each time but it makes it easier to display or log the results.
The one-liner (LCEL, the LangChain Expression Language) above is all you need. It's like a pipeline that:
Takes our prompt template
Runs it through the Gemini model
Converts the output to a simple string
There’s an alternative syntax you could use in case you prefer a slightly less verbose approach.
final chain = promptTemplate | chatModel | const StringOutputParser();
The .pipe() method (or the | operator) is similar to the Unix pipe operator: it chains the different components together, feeding the output of one component as input into the next.
Let's see it in action. Add this to main:
final sentiment = await chain.invoke({
  "text": "The food here is absolutely delicious!"
});
print(sentiment); // Outputs: POSITIVE
Then run the sentiment_analysis.dart file:
dart run sentiment_analysis.dart
That’s it: we've got our sentiment classification. This is also known as few-shot prompting.
Few-shot prompting is a technique used to improve the performance of language models by providing them with a small set of examples, also known as “demonstrations” or “few-shot examples,” within the prompt itself.
Note that if you constructed your chain like this:
final chain = promptTemplate | chatModel | const StringOutputParser();
The sentiment variable will be of type Object instead of String, unlike this:
final chain = promptTemplate.pipe(chatModel).pipe(const StringOutputParser());
We are parsing the generated output with StringOutputParser, so using the pipe operator won’t mess up the produced output even if the Dart analyzer disagrees about the type. You can confirm that with this:
print(sentiment.runtimeType); // Outputs: String
There's also an alternative approach where you format the prompt yourself:
final res = promptTemplate
.format({"text": " The food here is absolutely delicious!"});
The format() method allows you to explicitly prepare the prompt, giving you more granular control over the prompt creation process. Also, you don’t have to lay pipes this way.
We can’t pass a raw string to invoke, so we wrap it in PromptValue.string():
final sentiment = await chatModel.invoke(PromptValue.string(res));
Both methods achieve the same end result of sentiment classification:
print(sentiment); // Outputs: POSITIVE
Here’s the complete code sample:
import "dart:io";
import "package:langchain/langchain.dart";
import "package:langchain_google/langchain_google.dart";
void main() async {
final chatModel = ChatGoogleGenerativeAI(
apiKey: Platform.environment["GOOGLEAI_API_KEY"]!,
defaultOptions: const ChatGoogleGenerativeAIOptions(
model: "gemini-1.5-flash",
temperature: 0,
),
);
final promptTemplate = PromptTemplate.fromTemplate(
'''
Analyze the sentiment of the text below.
Respond only with one word to describe the sentiment.
INPUT: I absolutely adore sunny days!
OUTPUT: POSITIVE
INPUT: The sky is blue, and clouds are white.
OUTPUT: NEUTRAL
INPUT: I can't believe they canceled the show; it's so frustrating!
OUTPUT: NEGATIVE
INPUT: {text}
OUTPUT:
''',
);
// final chain = promptTemplate | chatModel | const StringOutputParser();
final chain = promptTemplate.pipe(chatModel).pipe(const StringOutputParser());
final sentiment =
await chain.invoke({"text": "The food here is absolutely delicious!"});
// final res = promptTemplate
// .format({"text": " The food here is absolutely delicious!"});
// final sentiment = await chatModel.invoke(PromptValue.string(res));
print(sentiment); // Outputs: POSITIVE
}
Faking a conversation chain
With LangChain.dart, you can actually fake a conversation chain by defining different templates for different roles.
- HumanChatMessage: a ChatMessage coming from a human/user.
- AIChatMessage: a ChatMessage coming from an AI/assistant.
- SystemChatMessage: a ChatMessage coming from the system.
- FunctionChatMessage / ToolChatMessage: a ChatMessage containing the output of a function or tool call.
Once Gemini sees the style already established in the conversation, it’ll try to follow it and reply with just the sentiment value we’re looking for.
Create a new file for this since it’s a bit different.
touch fake_conversation.dart
Let’s use the same model setup:
void main() async {
final chatModel = ChatGoogleGenerativeAI(
apiKey: Platform.environment["GOOGLEAI_API_KEY"]!,
defaultOptions: const ChatGoogleGenerativeAIOptions(
model: "gemini-1.5-pro",
temperature: 0,
),
);
...
Here's how we're setting up our new sentiment classification pipeline:
// Define the system message template
const systemTemplate = '''
Analyze the sentiment of the text below.
Respond only with one word to describe the sentiment.
''';
final fewShotPrompts = ChatPromptTemplate.fromPromptMessages([
SystemChatMessagePromptTemplate.fromTemplate(systemTemplate),
// Positive example
HumanChatMessagePromptTemplate.fromTemplate('I am so happy today!'),
AIChatMessagePromptTemplate.fromTemplate('POSITIVE'),
// Neutral example
HumanChatMessagePromptTemplate.fromTemplate('The sky is blue.'),
AIChatMessagePromptTemplate.fromTemplate('NEUTRAL'),
// Negative example
HumanChatMessagePromptTemplate.fromTemplate(
"I am very disappointed with the service."),
AIChatMessagePromptTemplate.fromTemplate('NEGATIVE'),
HumanChatMessagePromptTemplate.fromTemplate('I enjoy reading books.'),
]);
Let's break down what's happening:
System Message: We start with a clear instruction about what we want: a one-word sentiment description.
Few-Shot Examples: We're basically showing Gemini a mini-training set:
A positive sentiment example
A neutral sentiment example
A negative sentiment example
Invoking the model looks like this:
final aiChatMessage = await chatModel.call(fewShotPrompts.formatMessages());
We’re using .call since we need to pass in a list of messages, but we also need the messages as a List<ChatMessage>, so we call .formatMessages on the prompt.
Running the file results in this output:
print(aiChatMessage.content); // Outputs: POSITIVE
The prompt templates can also be constructed in other ways.
final fewShotPrompts = ChatPromptTemplate.fromTemplates([
(ChatMessageType.system, systemTemplate),
(ChatMessageType.human, 'I am so happy today!'),
(ChatMessageType.ai, 'POSITIVE'),
(ChatMessageType.human, 'The sky is blue.'),
(ChatMessageType.ai, 'NEUTRAL'),
(ChatMessageType.human, "I am very disappointed with the service."),
(ChatMessageType.ai, 'NEGATIVE'),
(ChatMessageType.human, 'I enjoy reading books.'),
]);
The invocation remains the same for both, so choose whichever is more convenient.
final aiChatMessage = await chatModel.call(fewShotPrompts.formatMessages());
print(aiChatMessage.content); // Outputs: POSITIVE
Here’s the complete code sample:
import 'dart:io';
import 'package:langchain/langchain.dart';
import 'package:langchain_google/langchain_google.dart';
void main() async {
final chatModel = ChatGoogleGenerativeAI(
apiKey: Platform.environment["GOOGLEAI_API_KEY"]!,
defaultOptions: const ChatGoogleGenerativeAIOptions(
model: "gemini-1.5-pro",
temperature: 0,
),
);
// Define the system message template
const systemTemplate = '''
Analyze the sentiment of the text below.
Respond only with one word to describe the sentiment.
''';
// Define the few-shot examples
final fewShotPrompts = ChatPromptTemplate.fromPromptMessages([
SystemChatMessagePromptTemplate.fromTemplate(systemTemplate),
HumanChatMessagePromptTemplate.fromTemplate('I am so happy today!'),
AIChatMessagePromptTemplate.fromTemplate('POSITIVE'),
HumanChatMessagePromptTemplate.fromTemplate('The sky is blue.'),
AIChatMessagePromptTemplate.fromTemplate('NEUTRAL'),
HumanChatMessagePromptTemplate.fromTemplate(
"I am very disappointed with the service."),
AIChatMessagePromptTemplate.fromTemplate('NEGATIVE'),
HumanChatMessagePromptTemplate.fromTemplate('I enjoy reading books.'),
]);
// final fewShotPrompts = ChatPromptTemplate.fromTemplates([
// (ChatMessageType.system, systemTemplate),
// (ChatMessageType.human, 'I am so happy today!'),
// (ChatMessageType.ai, 'POSITIVE'),
// (ChatMessageType.human, 'The sky is blue.'),
// (ChatMessageType.ai, 'NEUTRAL'),
// (ChatMessageType.human, "I am very disappointed with the service."),
// (ChatMessageType.ai, 'NEGATIVE'),
// (ChatMessageType.human, 'I enjoy reading books.'),
// ]);
final aiChatMessage = await chatModel.call(fewShotPrompts.formatMessages());
print(aiChatMessage.content); // Outputs: POSITIVE
}
What if we just ask the model?
I’m sure you’re not totally convinced that we needed to do all that just to get the sentiment, and you’re right to be skeptical. Anyone can simply paste the texts into ChatGPT and get the same results. We probably didn't need to give it examples or fake a conversation chain.
So how do we do things better since we’re Flutter/Dart developers and we definitely want precise, typed, structured outputs?
Enter prompting with a JSON schema, a technique that lets you get precise, structured output directly from the language model. The schema follows the specification defined at https://json-schema.org/.
We do not want to lose our progress so far so let’s move to a new file.
touch zero_shot.dart
You can name the above differently if you want.
Now, let’s define an enum for clear sentiment categorization
enum Sentiment { positive, neutral, negative }
void main() async {
...
}
Now we enhance our model configuration, mainly the ChatGoogleGenerativeAIOptions:
final chatModel = ChatGoogleGenerativeAI(
apiKey: Platform.environment["GOOGLEAI_API_KEY"]!,
defaultOptions: ChatGoogleGenerativeAIOptions(
responseMimeType: "application/json",
model: "gemini-1.5-flash",
temperature: 0,
responseSchema: {
"type": "object",
"properties": {
"sentiment": {
"type": "string",
"enum": Sentiment.values.map((e) => e.name).toList()
}
},
"required": ["sentiment"]
}),
);
We define our responseSchema and then pass it to the ChatGoogleGenerativeAIOptions to ensure a structured output.
Structured Outputs is a feature that ensures the model will always generate responses that adhere to your supplied JSON Schema, so you don't need to worry about the model omitting a required key, or hallucinating an invalid enum value.
Okay, what’s happening here?
- responseMimeType: "application/json" tells the model to return JSON.
- responseSchema defines exactly what we want: a JSON object with a sentiment property.
- We're dynamically creating an enum list from our Sentiment values.
The "enum": Sentiment.values.map((e) => e.name).toList() part serves a specific purpose in the JSON schema: when you define an enum in the schema, you're essentially telling Gemini the exact set of allowed values for that field.
In this case, Sentiment.values.map((e) => e.name).toList() converts the Dart enum values to a list of their string names.
This is our enum: enum Sentiment { positive, neutral, negative }
If you were to print the values out:
print(Sentiment.values); // Outputs [Sentiment.positive, Sentiment.neutral, Sentiment.negative]
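Mapping with e.name strips the Sentiment. prefix, so it's the bare names that end up in the schema:

```dart
enum Sentiment { positive, neutral, negative }

void main() {
  // e.name gives the identifier without the "Sentiment." prefix.
  final allowed = Sentiment.values.map((e) => e.name).toList();
  print(allowed); // [positive, neutral, negative]
}
```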
By including this in the schema, you're:
Constraining the model to only return these specific values
Providing clear guidance on what constitutes a valid response
Ensuring type safety by limiting the possible outputs
Without this constraint, Gemini might return arbitrary strings that don't match the enum. With it, the model is guided to return only "positive", "neutral", or "negative".
This is perfect for this zero-shot prompting example because it gives the model clear boundaries while still allowing it to make an intelligent classification based on the input text. With a constraint like this, even weaker models produce decent results.
Finally, required: ["sentiment"] ensures we always get a sentiment back.
The prompt has gotten simpler as well:
final promptTemplate = PromptTemplate.fromTemplate(
'''
Analyze the sentiment of the text below.
Respond with a JSON object containing the sentiment.
Text: {text}
'''
);
Our chain now includes a JsonOutputParser since, you know, we’re returning JSON:
final chain = promptTemplate.pipe(chatModel).pipe(JsonOutputParser());
final resultJson = await chain.invoke({
"text": "The food here is absolutely delicious!"
});
final sentiment = Sentiment.values.byName(resultJson['sentiment']);
print(sentiment); // Outputs: Sentiment.positive
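One thing worth knowing: Sentiment.values.byName is an exact, case-sensitive lookup and throws an ArgumentError for names that don't exist, which is exactly why constraining the schema to e.name values matters:

```dart
enum Sentiment { positive, neutral, negative }

void main() {
  // Exact match works.
  print(Sentiment.values.byName('positive')); // Sentiment.positive

  // Anything else throws, e.g. a capitalized value.
  try {
    Sentiment.values.byName('POSITIVE');
  } on ArgumentError {
    print('POSITIVE is not a valid Sentiment name');
  }
}
```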
The JsonOutputParser takes the output of the previous Runnable in the chain, converts it to a string, and then parses it as a JSON Map.
Here’s the complete code sample:
import 'dart:io';
import 'package:langchain/langchain.dart';
import 'package:langchain_google/langchain_google.dart';
// Enum to represent sentiment
enum Sentiment { positive, neutral, negative }
void main() async {
final chatModel = ChatGoogleGenerativeAI(
apiKey: Platform.environment["GOOGLEAI_API_KEY"]!,
defaultOptions: ChatGoogleGenerativeAIOptions(
responseMimeType: "application/json",
model: "gemini-1.5-flash",
temperature: 0,
responseSchema: {
"type": "object",
"properties": {
"sentiment": {
"type": "string",
"enum": Sentiment.values.map((e) => e.name).toList()
}
},
"required": ["sentiment"]
}),
);
final promptTemplate = PromptTemplate.fromTemplate(
'''
Analyze the sentiment of the text below.
Respond with a JSON object containing the sentiment.
Text: {text}
''',
);
final chain = promptTemplate.pipe(chatModel).pipe(JsonOutputParser());
final resultJson =
await chain.invoke({"text": "The food here is absolutely delicious!"});
final sentiment = Sentiment.values.byName(resultJson['sentiment']);
print(sentiment); // Outputs: Sentiment.positive
}
The Embedding Classification Approach
Unlike the prompt-powered approaches we’ve used so far, embeddings are a more sophisticated technique that transforms the plain text-matching problem into a vector-space similarity computation.
From LangChain.dart:
Embeddings create a vector representation of a piece of text. This is useful because it means we can think about text in the vector space, and do things like semantic search where we look for pieces of text that are most similar in the vector space.
So we’re storing semantic meaning as numbers in vectors, and in this type of classification we compare which texts are closest in meaning. This gives us a more nuanced and contextually aware classification.
In the upcoming example, we're going to be building a book genre classifier that can take a new book description and automatically figure out which genre bucket it belongs in.
We’re not using ChatGoogleGenerativeAI this time, although we’re going to reuse the API key, since the Google API key grants access to text embedding models like text-embedding-004, which is the default for GoogleGenerativeAIEmbeddings in LangChain.dart as of 29th November, 2024.
As we’ve done previously, create a new file in the lib directory. I’m going to call mine book_classification.dart:
touch book_classification.dart
With this at the start:
void main() async {
final embeddings = GoogleGenerativeAIEmbeddings(
apiKey: Platform.environment['GOOGLEAI_API_KEY'],
);
...
}
Document Setup
final documents = [
Document(
pageContent: '''
The Hobbit by J.R.R. Tolkien
A classic fantasy novel following the journey of Bilbo Baggins as he embarks on a quest to help dwarves reclaim their homeland from a dragon.,
Harry Potter and the Sorcerer's Stone by J.K. Rowling
A young wizard discovers his magical heritage and attends Hogwarts School, where he makes friends, uncovers secrets, and battles dark forces.,
The Name of the Wind by Patrick Rothfuss
The story of Kvothe, a gifted young man, and his rise from humble beginnings to a legendary figure.
''',
metadata: {'title': 'Fantasy'},
),
Document(
pageContent: '''
1984 by George Orwell
A dystopian novel set in a totalitarian society under constant surveillance, exploring themes of control, truth, and rebellion.,
Brave New World by Aldous Huxley
A chilling vision of a future society where individuals are conditioned to conform, and emotions and individuality are suppressed.,
Fahrenheit 451 by Ray Bradbury
A story about a fireman in a future society where books are banned and burned to suppress dissenting ideas.
''',
metadata: {'title': 'Dystopian'},
),
Document(
pageContent: '''
Pride and Prejudice by Jane Austen
A romantic novel about Elizabeth Bennet and her evolving relationship with the wealthy Mr. Darcy, set in 19th-century England.,
The Notebook by Nicholas Sparks
A tale of enduring love between Noah and Allie, spanning decades and overcoming obstacles.,
Me Before You by Jojo Moyes
A story about a young woman who becomes a caregiver for a paralyzed man, and the life-changing relationship they develop.
''',
metadata: {'title': 'Romance'},
),
];
We're creating a mini library of pre-classified books. Each book comes with a synopsis and a genre metadata tag. We’re adding the metadata tag because Google AI supports specifying a document title when embedding documents, and the title is then used by the model to improve the quality of the embeddings.
Embedding Generation
Using GoogleGenerativeAIEmbeddings, we'll transform these book descriptions into numerical vector representations. This helps us capture the semantic meaning of each document.
final documentsEmbeddings = await embeddings.embedDocuments(documents);
Once we get a “new” book that we want to classify, we do the same to it by generating an embedding for it.
final newBook = '''
**The Fellowship of the Ring by J.R.R. Tolkien**
This epic novel is the first installment of The Lord of the Rings trilogy.
It follows Frodo Baggins as he begins his journey to destroy the One Ring,
accompanied by a fellowship of friends and allies.
Themes:
* Adventure and camaraderie
* The battle between good and evil
* Sacrifice and heroism
Setting:
* The rich, fantastical world of Middle-earth, with locations like the Shire, Rivendell, and Moria.
Characters:
* Frodo Baggins, Samwise Gamgee, Aragorn, Gandalf, and more.
''';
final newBookEmbedding = await embeddings.embedQuery(newBook);
Now you can find the most similar pre-existing genre by comparing vector similarities.
final mostSimilarIndex =
getIndexesMostSimilarEmbeddings(newBookEmbedding, documentsEmbeddings)
.first;
The getIndexesMostSimilarEmbeddings function finds which pre-existing book embeddings are most similar to our new book, using cosine similarity by default. Calling .first on its output gives us the index of the most similar document. With that index, we can look up the category from the matching document.
final category = documents[mostSimilarIndex].metadata['title'];
print('The new book belongs to the genre: $category'); // Outputs: Fantasy
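If you're curious what the similarity comparison looks like under the hood, here's a minimal cosine-similarity sketch. This is an illustration only, not LangChain.dart's actual implementation, and the tiny 3-dimensional vectors are made up; real embeddings have hundreds of dimensions:

```dart
import 'dart:math';

/// Cosine similarity: dot(a, b) / (|a| * |b|), ranging from -1 to 1.
double cosineSimilarity(List<double> a, List<double> b) {
  var dot = 0.0, normA = 0.0, normB = 0.0;
  for (var i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (sqrt(normA) * sqrt(normB));
}

void main() {
  // Toy "embeddings" for two genres and a new book.
  final fantasy = [0.9, 0.1, 0.0];
  final romance = [0.1, 0.9, 0.2];
  final newBook = [0.8, 0.2, 0.1];

  // The new book points in roughly the same direction as the fantasy vector.
  print(cosineSimilarity(newBook, fantasy) >
      cosineSimilarity(newBook, romance)); // true
}
```

Classification then reduces to picking the document whose vector scores highest against the query vector, which is exactly what getIndexesMostSimilarEmbeddings does for us.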
In terms of succinctness, you can tell that this is more to the point than the JSON schema example above. The list of documents is the only thing that takes up space, and once the embeddings are generated you don’t need to regenerate them for each new book you want to classify. Vector stores like MemoryVectorStore or VertexAIMatchingEngine make things even easier as you work with larger documents.
Here’s the complete sample:
import 'dart:io';
import 'package:langchain/langchain.dart';
import 'package:langchain_google/langchain_google.dart';
void main() async {
final embeddings = GoogleGenerativeAIEmbeddings(
apiKey: Platform.environment['GOOGLEAI_API_KEY'],
);
final documents = [
Document(
pageContent: '''
The Hobbit by J.R.R. Tolkien
A classic fantasy novel following the journey of Bilbo Baggins as he embarks on a quest to help dwarves reclaim their homeland from a dragon.,
Harry Potter and the Sorcerer's Stone by J.K. Rowling
A young wizard discovers his magical heritage and attends Hogwarts School, where he makes friends, uncovers secrets, and battles dark forces.,
The Name of the Wind by Patrick Rothfuss
The story of Kvothe, a gifted young man, and his rise from humble beginnings to a legendary figure.
''',
metadata: {'title': 'Fantasy'},
),
Document(
pageContent: '''
1984 by George Orwell
A dystopian novel set in a totalitarian society under constant surveillance, exploring themes of control, truth, and rebellion.,
Brave New World by Aldous Huxley
A chilling vision of a future society where individuals are conditioned to conform, and emotions and individuality are suppressed.,
Fahrenheit 451 by Ray Bradbury
A story about a fireman in a future society where books are banned and burned to suppress dissenting ideas.
''',
metadata: {'title': 'Dystopian'},
),
Document(
pageContent: '''
Pride and Prejudice by Jane Austen
A romantic novel about Elizabeth Bennet and her evolving relationship with the wealthy Mr. Darcy, set in 19th-century England.,
The Notebook by Nicholas Sparks
A tale of enduring love between Noah and Allie, spanning decades and overcoming obstacles.,
Me Before You by Jojo Moyes
A story about a young woman who becomes a caregiver for a paralyzed man, and the life-changing relationship they develop.
''',
metadata: {'title': 'Romance'},
),
];
// Generate embeddings for the existing documents
final documentsEmbeddings = await embeddings.embedDocuments(documents);
// New book description to classify
final newBook = '''
**The Fellowship of the Ring by J.R.R. Tolkien**
This epic novel is the first installment of The Lord of the Rings trilogy.
It follows Frodo Baggins as he begins his journey to destroy the One Ring,
accompanied by a fellowship of friends and allies.
Themes:
* Adventure and camaraderie
* The battle between good and evil
* Sacrifice and heroism
Setting:
* The rich, fantastical world of Middle-earth, with locations like the Shire, Rivendell, and Moria.
Characters:
* Frodo Baggins, Samwise Gamgee, Aragorn, Gandalf, and more.
''';
// Generate embedding for the new book
final newBookEmbedding = await embeddings.embedQuery(newBook);
// Use LangChain's method to find the most similar embedding
final mostSimilarIndex =
getIndexesMostSimilarEmbeddings(newBookEmbedding, documentsEmbeddings)
.first;
// Get the category based on the most similar document
final category = documents[mostSimilarIndex].metadata['title'];
print('The new book belongs to the genre: $category'); // Outputs: Fantasy
}
Try creating a simple food classification project based on the knowledge you have so far, with categories for Appetizer, Main, and Dessert. You can use any language model to generate fake food recipes and then see if you can predict the category for a “new” recipe.
Conclusion
In conclusion, integrating Gemini and LangChain.dart into your apps makes it easier for Flutter and Dart developers to implement text classification. By leveraging these tools, developers can bypass the complexities of traditional frameworks and models in ecosystems like Python and JavaScript. Prompt templates, conversation chains, and JSON schemas for structured outputs give you flexibility in handling classification, and the embedding approach further improves our ability to categorize text based on semantic similarity.
You can find all the examples here: https://github.com/Nana-Kwame-bot/langchain_examples
David has created an amazing package. Check out the documentation here:
And give the package a like if you enjoy what you've seen:
There are going to be a few more articles in this LangChain.dart series; subscribe to my newsletter and follow me to get them as soon as I publish.