How to build AI features in 2024
Disclaimer: The examples are based on real situations, but they were changed to preserve the anonymity of the people involved as well as of the ideas discussed. Each client project was swapped for a similar one.
Last month I published a blog post sharing a few lessons I learned from building AI products. While discussing it with some peers, I found out that most people don't understand the different options available when it comes to adding AI capabilities to a product. For example, a CTO I was talking to last week mentioned he would hire a machine learning engineer to build a proof of concept for him. As he explained the problem, having built many of these myself, I couldn't help but think that there were faster, cheaper and more efficient ways of solving it. To help him, and others facing this same doubt, today I would like to talk about the different cases for building AI functionality in 2024.
During my time building products, I have seen AI-related challenges that can be categorised into three different situations.
1. You don't need AI at all
As most applications today prove to us, most functionalities don't require AI. AI can make a feature easier to implement in the short term, but it often means carrying an unnecessary long-term cost, whether by maintaining a custom LLM or paying a cloud provider. This is the situation for most AI requirements I've seen. My rule of thumb is: if the input is tabular data, the ranges are known, and the goal is to automate cases like "if this then that", then it might not need AI at all.
Case study:
Once, while doing a project discovery with a customer, he mentioned that he wanted an AI-integrated dashboard for his manufacturing floor. The application would consume data from sensors using the OPC UA protocol, and we would display this data in a dashboard for the engineers managing a plastic manufacturing plant. The idea was simple: the dashboard would display all the manufacturing data the engineers needed to adjust production in case any sensor showed something was wrong. For example, if the temperature in the boiler was too high or too low, it would show a warning telling the engineers to bring the temperature back into a specific range. His goal was to use AI to determine, based on the data, whether everything was on track. While thinking about this specific problem, I identified a few things:
a) We know what the correct temperature is
b) We know what the wrong temperature is
c) We know how to fix the problem
Because this information is known and fixed, all we needed were conditionals to determine whether these parameters were correct; using AI would only add unnecessary cost to build and maintain it. We could still use AI for unknown situations, where we would expect it to find tendencies in past data, but that would be a different problem to solve.
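To make this concrete, here is a minimal sketch of the "no AI" version of that boiler warning. The sensor, temperature range and messages are hypothetical values I made up for illustration, not the client's actual parameters:

```python
# A minimal sketch of the "no AI" approach: plain conditionals over known
# ranges. The range below is made up for illustration.
BOILER_TEMP_RANGE = (180.0, 220.0)  # hypothetical acceptable range, in °C

def check_boiler_temperature(reading: float) -> str | None:
    """Return a warning for the dashboard, or None if everything is on track."""
    low, high = BOILER_TEMP_RANGE
    if reading < low:
        return f"Boiler at {reading}°C is too cold: raise it to {low}-{high}°C"
    if reading > high:
        return f"Boiler at {reading}°C is too hot: lower it to {low}-{high}°C"
    return None

print(check_boiler_temperature(250.0))
```

Every rule here is a known fact about the process, so there is nothing for a model to learn; a handful of conditionals covers it.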
2. You can use cloud providers with LLMs
There are cases in which an AI model is indeed the best option. Most of the time they are related to pattern matching and/or chatbots. We see this in bots whose goal is providing the user with information, generating new content based on training data, and so forth. When thinking about these, we need to reason about the cost of building our own models, deploying our own or open source models on our own servers, or using cloud-provided options. When deciding between building a custom model and using an open source one, the first question to answer is: has this problem already been solved well by another model? If the answer is yes, then you might as well use the one that is ready. This will be the case for most chatbots. However, if you are solving a very specific problem, e.g. music generation, then you may need to build your own model.
When the question is where to host the model, we need to think about the privacy of the data we'll upload to it. Hosting the model on our own servers will always be the most "private" option available: the closer we are to the bare metal, the greater the control we have over the data. With that being said, it is expensive to maintain servers. When facing this problem, I ask myself how private this information really is. If I'm dealing with financial information, for example, then I would definitely work with my own models. If I'm managing less sensitive data, I would go with a cloud provider I trust, like AWS.
Case study:
One of my long-standing goals for Codelitt was for it to have its own company document chatbot: a place where employees could ask questions about the company. This required some sort of intelligence that could read all of our documents and reason about the best answer, which means having access to a predefined set of documents and being able to compose an answer from them. Because we already had all the necessary data, we didn't need to train an AI model; we just needed to feed our data into an LLM. And because this data, while confidential, isn't damaging, we could send it to a cloud provider we trust and make the chatbot available to our employees at a really low cost. We ended up implementing it using our open source API and OpenAI.
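To illustrate the pattern (this isn't our actual implementation, and the model name, folder and question are assumptions of mine), a bare-bones version using the official OpenAI Python SDK could look like this:

```python
# A minimal sketch of the document-chatbot pattern: feed known documents
# into an existing LLM, no training involved. Assumes the OpenAI Python SDK
# (pip install openai) and a hypothetical local "docs/" folder of text files.
from pathlib import Path
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def load_documents(folder: str = "docs") -> str:
    """Concatenate all company documents into one context string."""
    return "\n\n".join(p.read_text() for p in Path(folder).glob("*.txt"))

def ask(question: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # model choice is an assumption, not our setup
        messages=[
            {"role": "system",
             "content": "Answer employee questions using only these "
                        "company documents:\n" + load_documents()},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

print(ask("How many vacation days do we get?"))
```

For a larger document set you would chunk and embed the documents and retrieve only the relevant pieces per question, but the principle stands: no model training, just feeding known data into an existing LLM.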
3. You need your own model
This is the last one, and the one I avoid the most. In some specific situations we might need a custom model. The reason I avoid it is the cost, the time to implement it, and the risk of the implementation simply not being viable at all. This is where we see most companies losing money and not getting any return. The challenges of building a custom AI model are:
- It is hard to find specialists who can deliver on it.
- It requires an incredible amount of data, and often it just doesn’t exist.
- There is a chance that even if you have both a specialist and the data, you might simply not get the results you want.
I leave it as a last resort. With that being said, many companies are growing at a fast pace precisely because of their ability to solve very specific problems and be the best at it.
Case study 1:
Last year, we had a client that wanted to create an application that, given a geolocation, a property address and pictures of the property, would produce a high-level valuation of the property after renovation. While the problem seems simple at a glance, it is incredibly complex and requires a lot of data. To train and test this tool we extracted available public data and data from property management APIs. Unfortunately, after testing it, we found out that there isn't enough appraisal data available on the market yet to make this problem feasible to address.
Case study 2:
Once we had a prospect reach out to us because he wanted to create an app to help increase his sales. His business model was built around selling specific bricks for construction. He could sell simple bricks, but his main income came from selling specific bricks that are no longer fabricated. Customers remodelling their houses who wanted these specific bricks would reach out to him with pictures of the ones they needed, and he or an employee would go through the inventory to see if they had them. If they didn't, they would look at their catalogue to see where they could find them to resell. Our goal was to build an AI model that would take the customer's picture of the brick, check it against the inventory pictures to identify the one the customer meant, and answer whether it was available to be sold. This was a very interesting and feasible case; however, the customer wasn't ready to spend the time and money to get enough pictures of his inventory and decided not to continue with this project.
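We never built it, but a plausible sketch of that model is a pretrained image-embedding model plus nearest-neighbour search. The library choice, folder names and similarity threshold below are all assumptions of mine, not the client's setup:

```python
# A sketch of brick matching via pretrained image embeddings and cosine
# similarity. Assumes sentence-transformers with its CLIP checkpoint
# (pip install sentence-transformers pillow); paths and threshold are made up.
from pathlib import Path
from PIL import Image
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("clip-ViT-B-32")  # pretrained, no custom training

# Embed every inventory photo once, up front.
inventory_paths = sorted(Path("inventory").glob("*.jpg"))  # hypothetical folder
inventory_embeddings = model.encode([Image.open(p) for p in inventory_paths])

def find_matching_brick(customer_photo: str, threshold: float = 0.85):
    """Return the closest inventory photo, or None if nothing is similar enough."""
    query = model.encode(Image.open(customer_photo))
    scores = util.cos_sim(query, inventory_embeddings)[0]
    best = scores.argmax().item()
    if scores[best] < threshold:  # threshold is an arbitrary illustrative value
        return None
    return inventory_paths[best]

match = find_matching_brick("customer_brick.jpg")
print(f"Closest match: {match}" if match else "Not in inventory, check the catalogue.")
```

Notably, an off-the-shelf embedding model might have removed the need to train anything custom here, but validating that still requires the very inventory photos the client wasn't ready to collect.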
All in all, the market is under heavy pressure to add AI capabilities to products, but choosing the right approach is a huge step towards setting an application up for success rather than failure.