AI On Azure: Introducing Azure AI services — Building Basic AI App (1/n)

Madhusudhan Konda
7 min readFeb 14, 2024

In the last series of articles, we introduced ourselves to the world of AI from a developer/techie’s perspective. We learned, explored and experimented various frameworks by working with both closed source AI models such as OpenAI’s GPT-3.5 and GPT-4 to Meta’s Llama and CodeLlama as well as Mistral models.

In this series, I want to focus on Azure AI services — right from Azure’s OpenAI services to its support for foundational models (like Llama 2) as well as AI Vector search capabilities using AI Search (formerly Cognitive Search) to building CoPilots. We will also look at the Semantic Kernel framework (the one similar to the LangChain framework).

In the first one of the series, I’ll go over the setting up the Azure AI service.

Azure AI Services

Azure AI services is the latest umbrella of AI services offering from Microsoft for software developers to develop and deliver AI powered applications. We can build vector-search, language processing, language translation, speech and vision and other types of production-ready applications using Azure AI services module.

The Azure AI Studio is the entry point to develop, test, deploy, manage, maintain enterprise grade AI powered applications. It exposes all the models in one place:

All models are provided in one unified place — under Azure AI Studio

Confusingly, there are two more services in the Azure eco-system which will be eventually integrated into the Azure AI service in my view (The Azure AI Studio is in public preview mode at the time of writing this article — 10th Feb 2024).

But for completeness, I’m providing the info around these two services too (make sure you jump on to Azure AI Studio going forward though):

  • Azure OpenAI Service and
  • Azure Machine Learning Service

Azure OpenAI service

As the name suggests, the Azure OpenAI service deals with the OpenAI (the company behind ChatGPT) offerings: all the GPT-4, GPT 3.5 LLMs with a couple of variants and a text-embedding-ada-002 embedding model, as shown in the figure:

OpenAI’s available models in Azure OpenAI

These same models and functionality is now available in Azure AI Studio which we will take a peek shortly.

Azure Machine Learning Service

The Azure Machine Learning Service, on the other hand, is an advanced service that helps data scientists and engineers to bring in any non-OpenAI related LLMs, including custom/own models.

If we’d want to train our data using any of the Meta’s Lllama2 or Huggingface’s sentence transformers or Mistral’s mistral models, we can head over to AML service to do so.

The Azure AI Services is the way to go — and the Azure AI Studio is the web tool that’ll help us develop AI applications using both openAI and non-OpenAI models.

We will look at the focusing at the Azure AI Services and the corresponding Azure AI Studio in this article.

Getting started with Azure AI

Head over to your Azure Portal and login to your account. Most likely you’ll be asked to move to Pay-as-you-go subscription as the AI services are not enables on free-account.

Search for “Azure AI Studio” in the search bar and create a new Azure AI workspace:

Creating a new Azure AI Studio Workspace

There are a handful of services that AI workspace depends on — including:

  • Azure AI services
  • Azure Storage Account
  • Key Vault
  • ApplIcation Insights and
  • Azure Container Registry (ACR).

You have a choice of create a brand new instances of these relevant services or let the service create with suggested names (see below):

Letting AI Studio Service create the required dependent services

The rest of the options can be left as default as we will not be touching them at this point.

After a couple of minutes, we’ll have our Azure AI workspace created, which should be similar to the one shown in the image below:

The workspace provides configurational settings of the AI service on the whole — for example, it’ll help you to invoke the service via endpoints given under “Keys and Endpoints” menu item.

We are interested in working with the Azure AI Studio — so click on the “Launch Azure AI Studio” button to get the studio launched. This will take us to the “Manage” page where we will see the empty project details, permissions and group members etc.

Creating our first Project

Click on Details tab and then “View all” link in the projects tile. This should let us create a new project — give it a name (I’ve given spring-docs as the project name — this is an open source project that I’m currently working on during my spare time. This project is an AI assistant for Spring Framework docs:) — hopefully it’ll be out soon) and chose the “create a new AI hub” from the drop down for AI hub.

The next form will ask you to fill a few details — give them to the best of your knowledge. The project gets created shortly.

Awesome — the project gets created:

The AI project is ready for a roll!

Of course we need to create our model deployments and start playing with these deployments to build our AI powered app.

Deploying our models

Click on the Deployments in the left hand menu item, and then clicl on “Create” to deploy a model of your choice. The popup provides you all the available models — as shown below:

Select a model to deploy

Once you deploy a model, it’ll be available for us to start inferencing with it.

To use this model, we can go over to Playgrounds to start chatting with it. When you are on Playgrounds page, you’ll need to pick up the model on the right hand side where it says “Deployment” with a drop down option. Your deployed model will be available here.

As the image shows, the Playground helps you chat Q&A with the chosen LLM:

LLM in action

Grounding the LLM with private Data

The deployed model will surely help you retrieve answers/knowledge up until April 2023 as that was the cut off data for GPT-4 model. Not only having the latest info, it wouldn’t have any answers about our own private data — for example, pdf copy of my Elasticsearch in Action book :)

Usually LLMs make up an answer if they don’t have the information — which is called “hallucinations” in the AI world. We need to “ground” the LLM by providing the data that it needs to know — usually providing the pre-prepared context to answer from. This patten is called Retrieval Augment Generation (RAG). We will see this RAG pattern in detail towards the later part of this article series.

To ground our LLM with our data, we can upload our data and ask it to answer from the uploaded content only. To do this, in the Playground page, you should see “Add your data” tab next to the system message:

Uploading data to ground the LLM

Click on the the “Add a data source” and follow the instructions. Choose “Upload files” to upload a sample PDF from your local drive. You will need to create a few other resources including a AI search and blog storage resources (note: free tier AI search resource can’t be used — so you may have to choose at least Basic tier).

Adding dependent resources to upload our private data

In the next page, you’ll be presented with a upload file form — add your data here. Follow the instructions to get the data ingested. The ingestion process may take a couple of minutes, you would see the spinner in the Playgrounds page back where we started our “upload data” journey from.

Once the data is ingested successfully, our AI bot is ready to answer our questions — this time only from our data . To test the process I uploaded a document about Efficacy of Ayurveda medicine for treating Chronic Sinusitis — as one of my colleague was recently diagnosed with this condition :(

I asked about BTC, ETH, Rishi Sunak and others — it refused to answer me — as expected :)

But when I asked about Sinusitis, it provided me with good answers:

Grounded LLM

Including answering me in Spanish :)

Language translation out of the box!

How awesome it is! It’s time to wrap up for now — we will come back to taking it to next level in the next article. Stay tuned!

--

--

Madhusudhan Konda
Madhusudhan Konda

Written by Madhusudhan Konda

Madhusudhan Konda is a full-stack lead engineer, mentor, and conference speaker. He delivers live online training on Elasticsearch, Elastic Stack &Spring Cloud

Responses (1)