Hi there 👋

Welcome to James Malcolm’s blog. Here I share some thoughts on data, business and the combination of the two.

How does Trustpilot scoring work?

Trustpilot is a widespread consumer review platform. Founded in 2007, it receives over 1m reviews from consumers every month. Reviewers share their experiences with a company, giving a score from one to five and a free-text comment. Companies proudly display the scores in emails, on websites, and in other media as a form of social proof. As such, it’s important to understand how Trustpilot works and how you can influence the scores....

December 19, 2023 · 4 min · James Malcolm

Creating Private LLMs

I want to open this post by stating that privacy within large language models (LLMs) is a mammoth topic that spans much more than can be said in a single post. Instead, I want to narrow the focus of the post to showcase some approaches to introducing proprietary data into LLMs, with the privacy and safety of sensitive data at the forefront. In a study done by the AI Accelerator Institute, data privacy was the second biggest barrier to adopting LLMs within their company...

December 18, 2023 · 4 min · James Malcolm

What's next? Next word prediction with PyTorch

Today, I will take you through a simple next-word prediction model built using PyTorch. This next-word prediction is based on Google’s Smart Compose and is a form of language modelling. The knowledge learnt here forms the basis for large language models, even though the architectures differ. Specifically, we draw on research published by Google for Gmail’s Smart Compose feature. Smart Compose takes a few words the user inputs and then predicts the following words or sentences in emails you want to write....

November 8, 2023 · 7 min · James Malcolm

Handling multiple interactions with Langchain

There are many tutorials on getting started with Langchain and LLMs to create simple chat applications. In this post, I want to go slightly beyond that and go into a bit of detail on the role memory has in chat applications, and lastly touch on how you can scale your application across multiple sessions and multiple users. What is Langchain? Langchain is an open-source Python package that helps in creating LLM solutions....

October 24, 2023 · 5 min · James Malcolm

Counting Pennies - Deploy or buy GenAI?

In this post, we explore the cost of deploying or buying your generative AI. Specifically, I want to focus on the computing cost, not the additional costs that contribute to the total cost of ownership. I want to explore three options: Managed (use OpenAI directly); Self-managed (deploy using AWS); Self-managed (deploy using Google Cloud). This post is part of my wider LLM series: Handling multiple interactions with Langchain; LLM Risks - Prompt Injection. Or a full list of posts, available here....

August 7, 2023 · 5 min · James Malcolm