Concerns about computer programming with AI

A few months ago I decided to publish more content on this website. When I published my first articles, one of the first responses I received was "so refreshing to read an actual personally written article again, instead of one generated by AI". I guess that is reality nowadays.

In 2021 GitHub released Copilot as an AI code assistant suggesting improvements to the code a developer is writing, and Microsoft later integrated the functionality across development environments like Visual Studio (Code) and, in 2023, into a general-purpose chat bot. What started as code completion has since grown into vibe coding products that can do a lot of the programming for you.
Since then Copilot, Cursor, Claude and other vibe coding tools have been booming.

Almost every review I read about coding with AI, often referred to as vibe coding, starts with something like "I am not against vibe coding", followed by the pros and cons of using these tools.

Of course I have played with some of the vibe coding tools to see how they perform. In 2024 my experience was "promising, but not quite there yet"; I was unimpressed. However, things are moving fast in this space, and earlier this year I tried again with a small pilot project. The results are becoming much better. People without much understanding of how to code can now get results that at least deliver code that does what they want.

After reading this article by someone who stopped using AI code editors, my own reservations about vibe coding were only confirmed.

Intellectual property

Almost every commercial company nowadays has implemented some form of IT. This can range from using only standard software to (at least partly) custom programming and databases. That code and the contents of those databases are the intellectual property of the business.
Often they are the company's unique value compared to its competitors. For example, if a company uses some form of customer relationship management (CRM) software, the contents of its database, all their clients and the information about those clients, are stored there.

This is very valuable information. I cannot imagine any company wanting to give it away, let alone pay to do so.

In my professional environment I could not use vibe coding tools yet: when using AI assisted coding like Copilot or Claude, (snippets of) the code you write are sent to the AI provider, and depending on the plan may be used to train their AI model(s). When I write code for my clients, the intellectual property rights are usually owned by the client. So when Microsoft started to push their Copilot product into almost every product, I had to be sure no client code was going to leak into the training of Copilot or any other AI assisted coding tool.
I am not in the position to make that decision for the companies I consult for.

In the beginning of 2024 this was easy to do by disabling Copilot in Visual Studio Code, but at the beginning of this year Microsoft started to push Copilot so hard that I could no longer trust that none of the code I was writing for my clients was leaking to Microsoft. So I switched to VSCodium, which is the same programming environment without all the Microsoft telemetry.
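
For reference, the relevant switches live in VS Code's settings.json. The setting names below exist in current VS Code releases, but check the documentation of your own version:

```jsonc
{
  // Turn off Microsoft telemetry reporting.
  "telemetry.telemetryLevel": "off",
  // Disable Copilot suggestions for all languages.
  "github.copilot.enable": {
    "*": false
  }
}
```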

Not only do the vibe coding tools ask for your code, they now also want access to the database the code has to work with. I can understand that in a way, because most software only works properly in combination with its corresponding database.
In a perfect world every company would use only mock test data in those databases and keep production data in a completely separate environment, but let's be real, this often does not happen. It is a huge risk.

In the past a lot of programmers used tutorials to integrate services like AWS S3. Every tutorial says something like "for this demo we skip the credential handling for this service, but you should implement it for your production project", and of course there have been a lot of leaks of data stored in AWS S3 buckets that should never have happened.
These security leaks still happen.
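
The part those tutorials postpone is usually simple: keep secrets out of the source entirely. A minimal sketch, assuming the standard AWS environment variable names (the helper function itself is hypothetical):

```python
import os

def get_s3_credentials():
    """Read AWS credentials from the environment instead of
    hardcoding them in the source, so they never end up in the
    repository (or in whatever a coding assistant uploads)."""
    access_key = os.environ.get("AWS_ACCESS_KEY_ID")
    secret_key = os.environ.get("AWS_SECRET_ACCESS_KEY")
    if not access_key or not secret_key:
        raise RuntimeError("AWS credentials are not configured")
    return access_key, secret_key
```

The point is that the code fails loudly when the credentials are missing, instead of shipping with a working demo key baked in.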

Consider the following scenario: you have an existing project and want to use AI to improve it. To accomplish this, you have to give the AI access to your code and database so it understands what to improve. Even if you use a database with mock data, the AI still has access to your code and may be trained on it.
And what happens when there is a bug in the production environment involving production data not covered by your mock database? We are all human; it might seem faster to give the AI access to the production database too. And with one small configuration error, you have handed the AI provider all of your data.

The cause of such a data breach is then basically user error, not the AI, but I think we just have to wait until these breaches happen. And once your data is in the hands of the AI company, good luck with that.

Creativity

Large Language Models (LLMs) are impressively good at recognizing patterns. An LLM is trained on A LOT of data, and based on that it predicts what is most likely to be the next word it presents to you. Every LLM has a lot of parameters with which it can be 'polished' to behave in a certain way. They also have a 'creativity' parameter that makes the model generate a different answer each time a more complex question is asked. This way they do not give the exact same answer to every question from every user. (It works somewhat differently of course, but I hope you get the picture.)
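
As a rough sketch of how that 'creativity' parameter (usually called temperature) works in principle, here is a toy sampler; this is an illustration, not how any particular LLM is implemented:

```python
import math
import random

def sample_next_token(logits, temperature=1.0):
    """Pick a token index from raw model scores (logits).

    A low temperature makes the most likely token dominate; a high
    temperature flattens the distribution, so repeated calls give
    more varied ("creative") answers.
    """
    scaled = [l / temperature for l in logits]
    # Softmax with max-subtraction for numerical stability.
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sample an index proportionally to its probability.
    r = random.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return i
    return len(probs) - 1
```

At a very low temperature this almost always returns the highest-scoring token; at a high temperature the choice spreads out over the whole vocabulary, which is why the same question can yield different answers.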

You can ask an LLM to write a story for you, or even a complete book, and it can be very appealing. But somehow all those generated texts often feel somewhat 'off'. The first responses I got on my articles, mentioned at the top of this article, are an expression of that. There is even a whole field of research evolving around this.

If I ask an LLM to set up a Python FastAPI project for me, it creates a file structure that works. Very handy: it gives me a head start on a new project, and that is a benefit for a programmer. But if a whole program or website is generated by AI, you do not get the creativity that program needs; you get the average of the code the LLM was trained on, usually collected from GitHub. And all that publicly available training code also contains A LOT of bad code.
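
For illustration, the layout such a generated FastAPI project typically has looks something like this (file names are illustrative, not a fixed standard):

```
myproject/
├── app/
│   ├── main.py        # creates the FastAPI() instance
│   ├── routers/       # one module per group of endpoints
│   ├── models.py      # Pydantic / ORM models
│   └── database.py    # database session setup
├── tests/
└── requirements.txt
```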

An LLM basically works on the principle of the wisdom of the crowd, which generally leads to good results. Never the best result, but good enough.
Is that a bad idea? Probably not: even a company that wants to hire the best programmer will probably settle for one who is 'good enough', because the best is probably also the most expensive.
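
A toy illustration of that wisdom-of-the-crowd effect (the numbers are made up for the demonstration):

```python
import random

random.seed(0)  # reproducible demo

true_value = 100.0
# Each "crowd member" gives a noisy estimate of the true value.
guesses = [true_value + random.gauss(0, 20) for _ in range(1000)]

# The average of many mediocre guesses is usually close to the
# truth: good, but never better than what the crowd already knows.
crowd_estimate = sum(guesses) / len(guesses)
```

The average lands near the true value even though most individual guesses are far off, which is exactly "good enough, never the best".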

By using vibe coding you by definition accept that you will get results that are at best 'good enough'.

A human programmer will try to deliver the best they can in the time available for the task. The result will probably contain bugs, and it will differ from the average generated by an AI tool, but the programmer will have thought it through, probably made one or two really smart decisions in the code, and probably kept other (future) requirements in mind while creating it.
It will probably be more useful to the company it is written for, and thus have higher quality. Basically, the programmer puts creativity into it.

A novelist wants their book to be good. Not just because it sells better, but because they will be proud of it. If you tweak an LLM to write books that sell best, they will turn out more and more of the same.

Business case of vibe coding tools

AI is a hype at the moment. In the early days of the internet, a lot of Internet Service Providers (ISPs) offered free internet access, and the users became the product. The internet was still small at that time, investments in the networks were high, and ISPs needed money to grow. So they gave internet access away for free to maximize growth and attract as many users as possible.
The ISPs sold their customers' profile data to advertising companies, which could then build more specific profiles of all those new online users to target ads more efficiently. This made the internet profitable very quickly.
After this initial hype was over, free ISPs basically disappeared, and we now all pay for internet access to view ads served by those advertising companies, which build an ever more detailed profile of each person online every day.

Another example is the cloud computing business. When the large cloud providers started, they had only a minimum viable product that they sold cheaply, or even gave away for free. AWS, Google and Microsoft still offer free tiers, just enough to try something out, but when you really want to use their services, you now often pay more than hosting it yourself would cost. Nowadays a lot of businesses rely fully on the cloud, resulting in vendor lock-in and almost no way back.

AI companies use the same strategy.

In the beginning, using the LLMs was free; as the use of ChatGPT skyrocketed, the more advanced functionality became paid very quickly. The AI companies invest billions of dollars, but it is probably not yet profitable for any of them. The AI boom looks a lot like the dot-com bubble. What they now charge is an introductory low price for the services they offer.

All the AI companies promise that their large language models will eventually lead to a form of Artificial General Intelligence (AGI), but it is becoming clearer and clearer that these LLMs are hitting a wall on the way to achieving that.
All the investments the AI companies are now making to achieve this must become profitable in the near future, so I am afraid those prices will become much higher.

With vibe coding, AI companies want you to believe that knowing how to program is no longer necessary.

At the same time that AI became more and more of a hype, a lot of programmers were losing their jobs. Not only because AI was getting better and better, but to a large extent because those engineers had become too expensive for the big tech companies, and AI was a convenient excuse. Vibe coding tools also have a free tier, but if you really want to accomplish things, you pay with tokens you have to buy. This is a never ending story.

This means fewer jobs for computer programmers, and the AI companies happily jump into the gap, because companies still need programming done and AI is cheap for now.

But what if there are no computer programming roles left? What if AI becomes so expensive that it costs more than a programmer cost in 2024? The answer you usually get is that with new technology, jobs shift in the long term: people are now supposedly needed in new roles, like prompt engineering.

I have my doubts in this case. Vibe coding does not seem like the holy grail to me, and the AI companies want to replace too many jobs, not only computer programmers. The costs of an average business will rise in the long term, benefiting only the few big AI tech companies. Of course I cannot predict the future, but what happened with the large cloud providers and their vendor lock-in makes me nervous.

My advice for now

  • Be very careful about giving away your intellectual property to AI companies, especially the contents of your databases.
  • Use AI as a tool to increase productivity in the non-critical parts of your company, at least for now; do not rely on vibe coding alone yet.

Links