This article is a step-by-step guide to introduce you to Large Language Models (LLMs) in Java applications using LangChain4j. We will learn how to install Llama 3 on a local machine and how to connect to it and use it from a Java application.
What are Large Language Models?
Firstly, some terminology: a Large Language Model (LLM) is a type of artificial intelligence (AI) model trained on a massive dataset of text and code. LLMs can generate text, translate languages, write different kinds of creative content, and answer your questions in an informative way.
For example, GPT-3 (Generative Pre-trained Transformer 3) by OpenAI is one of the most famous LLMs. You can use it online with a free plan, or sign up for a paid plan and access it from your applications using an OpenAI API key.
In our example, we will be using Llama 3, a large language model (LLM) developed and released by Meta AI. It is part of a family of LLMs called Llama, with Llama 3 being the latest and most advanced version.
One of the key features of Llama 3 is that it’s open-source. This means the code behind the model is publicly available, allowing researchers and developers to access, study, and potentially modify or improve it. This fosters collaboration and innovation in the field of LLMs.
That being said, let’s start by installing Ollama, which we will use to run Llama 3 locally, in just a few simple steps!
- Firstly, reach Ollama’s download page: https://ollama.com/download
- There, choose the installer for your Operating System. For Windows machines, just download the executable file.
On a Linux machine, you can install it as follows:
curl -fsSL https://ollama.com/install.sh | sh
- Finally, verify that your installation is successful by requesting the URL: http://localhost:11434/
You should see the following status on the browser or on curl:
Ollama is running
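Note that the Ollama server alone does not ship with any model weights. Before the Java examples below can produce an answer, the llama3 model must be downloaded locally (the download is several GB, so it may take a while). The following commands use the standard Ollama CLI; the curl call is an optional smoke test against the same REST endpoint the Java client will talk to:

```shell
# Download the llama3 model weights (one-time operation, several GB)
ollama pull llama3

# List the models available locally to confirm the pull succeeded
ollama list

# Optional smoke test against the local REST API
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Say hello in one word.",
  "stream": false
}'
```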
Testing the Model with LangChain4j
You can interact with an LLM in many ways and from many languages. If you are a Java developer, one of the simplest ways to integrate your application with LLMs is the LangChain4j framework. LangChain4j empowers Java developers to seamlessly integrate Large Language Models (LLMs) and embedding stores into their applications.
LangChain4j provides a unified API for different LLM providers (OpenAI, Google Vertex AI) and embedding stores (Pinecone, Vespa). You can use a single API, just as you would with Hibernate to interact with multiple databases.
Besides, LangChain4j offers an extensive toolbox that provides a wide range of tools for common LLM operations, from low-level prompt templating, memory management, and output parsing to high-level patterns like Agents and RAG. In this article, we will just scratch the surface with a simple access from a Java class.
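To give a flavour of what low-level prompt templating means, here is a toy sketch in plain Java. This is not LangChain4j’s actual PromptTemplate API, only an illustration of the idea: a reusable prompt string with named placeholders that get filled in with values at runtime.

```java
import java.util.Map;

// Minimal sketch of prompt templating: replace {{name}} placeholders
// in a template with values from a map. LangChain4j ships a richer
// PromptTemplate abstraction; this only illustrates the concept.
public class PromptSketch {

    static String fill(String template, Map<String, String> values) {
        String result = template;
        for (Map.Entry<String, String> e : values.entrySet()) {
            result = result.replace("{{" + e.getKey() + "}}", e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        String template = "Summarize the following text in {{lang}}: {{text}}";
        String prompt = fill(template, Map.of("lang", "English", "text", "Llama 3 is an open LLM."));
        System.out.println(prompt);
    }
}
```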
Where do we start from? Clearly from the libraries. In order to use langchain4j in your project, you need to include the following dependency:
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j</artifactId>
    <version>0.31.0</version>
</dependency>
Then, you need to add the library for your specific Chat Model. For example, in order to use Llama Chat Model, include the following library:
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-ollama</artifactId>
    <version>0.31.0</version>
</dependency>
In much the same way, you need to include the OpenAI model library if you want to use an OpenAI model instead:
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-open-ai</artifactId>
    <version>0.31.0</version>
</dependency>
In the following examples, we will be using JBang to simplify the project setup. You can find a getting started guide about JBang here: JBang: Create Java scripts like a pro.
On the other hand, if you prefer using a Maven or Gradle project, just include the langchain4j and langchain4j-ollama artifacts in your project.
A simple AI Chat
The first class we will test is a simple chat model which uses the OllamaChatModel and the server that is running on localhost:11434:
//JAVA 21
//DEPS dev.langchain4j:langchain4j:0.31.0
//DEPS dev.langchain4j:langchain4j-ollama:0.31.0

import java.time.Duration;

import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.ollama.OllamaChatModel;

public class BasicDemo {

    private static final String MODEL = "llama3";
    private static final String BASE_URL = "http://localhost:11434";
    private static Duration timeout = Duration.ofSeconds(120);

    public static void main(String[] args) {
        ChatLanguageModel model = OllamaChatModel.builder()
                .baseUrl(BASE_URL)
                .modelName(MODEL)
                .timeout(timeout)
                .build();

        System.out.println("Welcome to Llama 3! Ask me a question!");
        String question = System.console().readLine();
        String answer = model.generate(question);
        System.out.println(answer);
    }
}
When you run this class, it will prompt you for a question and print the model’s reply before quitting:
jbang BasicDemo.java
Using a Memory Chat
Our previous example does not store any information across prompts. Therefore, you cannot use the AI model to keep generating on a topic you are already discussing.
The following Java class sets up a conversational AI using the LangChain4j library:
//JAVA 21
//DEPS dev.langchain4j:langchain4j:0.31.0
//DEPS dev.langchain4j:langchain4j-ollama:0.31.0

import java.time.Duration;

import dev.langchain4j.chain.ConversationalChain;
import dev.langchain4j.memory.ChatMemory;
import dev.langchain4j.memory.chat.MessageWindowChatMemory;
import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.ollama.OllamaChatModel;

public class ChatDemo {

    private static final String MODEL = "llama3";
    private static final String BASE_URL = "http://localhost:11434";
    private static Duration timeout = Duration.ofSeconds(120);

    public static void main(String[] args) {
        ChatLanguageModel model = OllamaChatModel.builder()
                .baseUrl(BASE_URL)
                .modelName(MODEL)
                .timeout(timeout)
                .build();

        ChatMemory chatMemory = MessageWindowChatMemory.withMaxMessages(20);

        ConversationalChain chain = ConversationalChain.builder()
                .chatLanguageModel(model)
                .chatMemory(chatMemory)
                .build();

        String answer = chain.execute("Hello my name is Francesco!");
        System.out.println(answer);

        answer = chain.execute("Do you remember my name?");
        System.out.println(answer);
    }
}
Here are the two key objects that enable a conversational set of prompts:
- ChatMemory (chatMemory): an instance of MessageWindowChatMemory with a maximum of 20 messages. This object maintains the context of the conversation, enabling the model to remember previous interactions within the specified message window.
- ConversationalChain (chain): an instance of ConversationalChain, built with the chat language model and the chat memory. It represents the complete setup for handling a conversational interaction, incorporating both the model and the memory to generate context-aware responses.
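To make the role of the message window concrete, here is a tiny plain-Java sketch of the eviction idea behind MessageWindowChatMemory. This is not the library’s actual implementation, only an illustration: once the window is full, the oldest message is dropped so the context sent to the model stays bounded.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Toy sliding-window chat memory: keeps at most maxMessages entries,
// evicting the oldest first. This mirrors the idea behind
// MessageWindowChatMemory.withMaxMessages(20), not its actual code.
public class WindowMemorySketch {

    private final int maxMessages;
    private final Deque<String> messages = new ArrayDeque<>();

    WindowMemorySketch(int maxMessages) {
        this.maxMessages = maxMessages;
    }

    void add(String message) {
        if (messages.size() == maxMessages) {
            messages.removeFirst();  // evict the oldest message
        }
        messages.addLast(message);
    }

    List<String> messages() {
        return List.copyOf(messages);
    }

    public static void main(String[] args) {
        WindowMemorySketch memory = new WindowMemorySketch(3);
        memory.add("user: Hello my name is Francesco!");
        memory.add("ai: Nice to meet you, Francesco!");
        memory.add("user: Do you remember my name?");
        memory.add("ai: Yes, your name is Francesco.");
        // Only the three most recent messages remain in the window
        System.out.println(memory.messages());
    }
}
```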
As you can see from the output, the model remembers my name, which was part of the first prompt:
jbang ChatDemo.java
Conclusion
In this tutorial, we have successfully demonstrated how to set up and test a conversational AI model using LangChain4j and Llama 3. In the next tutorials, we will show some possible applications of an LLM, such as summarizing an entire book or creating a Quarkus application that leverages LangChain4j. Stay tuned!