Trusting your own judgement on AI is a huge risk
www.baldurbjarnason.com/2025/trusting-your-own-…
The article explains the general problem using software development as its example. But given that AI models are heavily promoted by billion-dollar US companies, and that important actors in that space are not at all friendly to the European Union, I think its relevance reaches far wider.
Generally, the article explains that judging the usefulness of AI models, specifically LLMs, by trying them out is very prone to the same psychological traps as astrology, tarot cards or psychics - the so-called Barnum effect. This is specifically because these models are carefully engineered to produce plausible-sounding answers! And even very intelligent but unaware people can easily fall prey to it.
Comments from other communities
Well, long walk for a short drink of water... Moreover, the point around the anecdotal Cloudflare developer's self-experiment seems to be "Don't trust hearsay" rather than "don't self-experiment", as the major gripes with the experiment are all related to external lack of trust in the experimenter's applied scientific methods. I get the point of not overestimating your own ability to think critically in this regard, but I wouldn't discard self-experimentation per se, as it could always lead to further study.
I believe we can educate people about the truths of AI, but I am scared to trust the corporations or the governments with it.
I can, however, recommend the book AI Snake Oil by Arvind Narayanan and Sayash Kapoor, which is written for lay readers and might help in understanding the limitations of AI.
I don't know what to think anymore. I guess I'll have to ask ChatGPT what to do.
What caught my attention is that assessments of AI are becoming polarized and somewhat a matter of belief.
Some people firmly believe LLMs are helpful. But programming is a logical task and LLMs can't think - only generate statistically plausible patterns.
The author of the article explains that this creates the same psychological hazards as astrology or tarot cards, psychological traps that have been exploited by psychics for centuries - and even very intelligent people can fall prey to these.
Finally, what should cause alarm is that, on top of LLMs not being able to think while people behave as if they do, there is no objective, scientifically sound examination of whether AI models can create working software any faster. Given that there are multi-billion-dollar investments, and that there has been more than enough time to carry out controlled experiments, this should raise loud alarm bells.
The problem, though, with responding to blog posts like that, as I did here (unfortunately), is that they aren’t made to debate or arrive at a truth, but to reinforce belief. The author is simultaneously putting himself on the record as having hardline opinions and putting himself in the position of having to defend them. Both are very effective at reinforcing those beliefs.
A very useful question to ask yourself when reading anything (fiction, non-fiction, blogs, books, whatever) is “what does the author want to believe is true?”
Because a lot of writing is just as much about the author convincing themselves as it is about them addressing the reader. ...
There is no winning in a debate with somebody who is deliberately not paying attention.
This is all also a great argument against the many articles claiming that LLMs are useless for coding, in which the authors all seem to have a very strong bias. I can agree that it's a very good idea to distrust what people are saying about how programming should be done, including mistrusting claims about how AI can and should be used for it.
We need science
Our only recourse as a field is the same as with naturopathy: scientific studies by impartial researchers. That takes time, which means we have a responsibility to hold off as research plays out.
This, on the other hand, is pure bullshit. Writing code is itself a process of scientific exploration; you think about what will happen, and then you test it, from different angles, to confirm or falsify your assumptions. The author seems to be saying that both evaluating the correctness of LLM output and the use of Typescript are comparable to falling for homeopathy by misattributing the cause of recovering from an illness. The idea that programmers should not use their own judgment or do their own experimentation, that they have no way of telling if code works or is good, to me seems like a wholesale rejection of programming as a craft. If someone is avoiding self-experimentation as suggested, I don't know how they can even say that programming is something they do.
Writing code is itself a process of scientific exploration; you think about what will happen, and then you test it, from different angles, to confirm or falsify your assumptions.
What you confuse here is doing something that can benefit from applying logical thinking with doing science. For example, arithmetic is part of math and math is a science. But summing numbers is not necessarily doing science. And if you roll, say, octal dice to see if the result happens to match an addition task, it is certainly not doing science - and no, the dice still can't think logically and certainly don't do math, even if the result sometimes happens to be correct.
For the dynamic vs static typing debate, see the article by Dan Luu:
https://danluu.com/empirical-pl/
But this is not the central point of the above blog post. The central point of it is that, by the very nature of LLMs producing statistically plausible output, self-experimenting with them subjects one to very strong psychological biases because of the Barnum effect, and therefore it is, first, not even possible to assess their usefulness for programming by self-experimentation(!), and second, it is even harmful, because these effects lead to self-reinforcing and harmful beliefs.
And the quibbling about what "thinking" means just shows that the pro-AI arguments have degraded into a matter of belief - the argument has become "but it seems to be thinking to me", even though it is neither technically possible nor observed in practice that LLMs apply logical rules: they cannot derive logical facts, cannot explain their output by reasoning, are not aware of what they 'know' and don't 'know', and cannot weigh decisions against multiple complex and sometimes contradictory objectives (which is absolutely critical to any sane software architecture).
What would be needed here are objective, controlled experiments on whether developers equipped with LLMs can produce working and maintainable code any faster than ones not using them.
And the very likely result is that the code which they produce using LLMs is never better than the code they write themselves.
What you confuse here is doing something that can benefit from applying logical thinking with doing science.
I'm not confusing that. Effective programming requires and consists of small scale application of the scientific method to the systems you work with.
the argument has become “but it seems to be thinking to me”
I wasn't making that argument so I don't know what you're getting at with this. For the purposes of this discussion I think it doesn't matter at all how it was written or whether what wrote it is truly intelligent, the important thing is the code that is the end result, whether it does what it is intended to and nothing harmful, and whether the programmer working with it is able to accurately determine if it does what it is intended to.
The central point of it is that, by the very nature of LLMs producing statistically plausible output, self-experimenting with them subjects one to very strong psychological biases because of the Barnum effect, and therefore it is, first, not even possible to assess their usefulness for programming by self-experimentation(!), and second, it is even harmful, because these effects lead to self-reinforcing and harmful beliefs.
I feel like "not even possible to assess their usefulness for programming by self-experimentation(!)" is necessarily a claim that reading and testing code is something no one can do, which is absurd. If the output is often correct, then the means of creating it is likely useful, and you can tell if the output is correct by evaluating it in the same way you evaluate any computer program, without needing to directly evaluate the LLM itself. It should be obvious that this is a possible thing to do. Saying not to do it seems kind of like some "don't look up" stuff.
Are you saying that it is not possible to use scientific methods to systematically and objectively compare programming tools and methods?
Of course it is possible, in the same way that it can be investigated which methods are most effective in teaching reading, or whether brushing teeth is good for preventing caries.
And the latter kind of study has been done, for example comparing statically vs. dynamically typed languages. Only, the result there so far is that there is no conclusive advantage.
Are you saying that it is not possible to use scientific methods to systematically and objectively compare programming tools and methods?
No, I'm saying the opposite, and I'm offended at what the author seems to be suggesting, that this should only be attempted by academics, and that programmers should only defer to them and refrain from attempting this to inform their own work and what tools will be useful to them. An absolutely insane idea given that the task of systematic evaluation and seeking greater objectivity is at the core of what programmers do. A programmer should obviously be using their experience writing and testing both typing systems to decide which is right for their project, they should not assume they are incapable of objective judgment and defer their thinking to computer science researchers who don't directly deal with the same things they do and aren't considering the same questions.
This was given as an example of someone falling for manipulative trickery:
A recent example was an experiment by a CloudFlare engineer at using an “AI agent” to build an auth library from scratch.
From the project repository page:
I was an AI skeptic. I thought LLMs were glorified Markov chain generators that didn’t actually understand code and couldn’t produce anything novel. I started this project on a lark, fully expecting the AI to produce terrible code for me to laugh at. And then, uh… the code actually looked pretty good. Not perfect, but I just told the AI to fix things, and it did. I was shocked.
But understanding and testing code is not (necessarily) guesswork. There is no reason to assume this person is incapable of it, and no reason to justify the idea that it should never be attempted by ordinary programmers when that is the main task of programming.
I use AI to make GTK shell widgets for my Linux rice. It's definitely not as good as an experienced ricer, but it can give good boilerplate. In the end I have to troubleshoot multiple logic errors, but it's better than writing all that spaghetti myself.
Other than that, the only use case I find for AI in coding is to cross-check my code or make it generate tests for me. Even that is very rare.
My justification: I use AI because I don't want to write 1000-5000 (combined) lines of code for a simple dock widget that can do a couple of custom actions I use. Also, it's practically guaranteed that the shell (ignis, ags) I use today will become the old thing very quickly, so I don't like spending much time on it.
I fear this is a problem that may never be solved. I mean that people of any intelligence fall for the mind's biases.
There's just too little to be gained feelings-wise. Yeah, you make better decisions, but you're also sacrificing "going with the flow", acting like our nature wants us to act. Going against your own nature is hard and sometimes painful.
Making wrong decisions is objectively worse, leading to worse outcomes, but if it doesn't feel worse (because you're not attributing the effects of the wrong decisions to the right cause, i.e. acting irrationally), then why should a person do it. If you follow the mind's bias towards attributing your problems away from irrationality, it's basically a self-fulfilling prophecy.
Great article.
Another similar article that's really good:
If you have to use AI - maybe your work insists on it - always demand it cite its sources, hope they are relevant, and go read those instead.
What's the difference between copying a function from Stack Overflow and copying a function from an LLM that has copied it from SO?
LLMs are sort of a search engine with advanced language substitution features, nothing more, nothing less.
Because it's not a plain copy but an interpretation of SO.
With an LLM you just have one more layer between you and the information, one that can distort that information.
And?
The issue is that you should not blindly trust code. Being originally written by a human being is not, by any means, a quality certification.
That is actually missing an important issue: hallucinations.
Copying from SO means you are copying from a human who might be stupid or lie, but who rarely spews out plausible-sounding hot garbage (not never, though), and because of other users voting, reputation, etc., you actually do end up with a decently reliable source.
With an LLM you could get something made up based on nothing related to the real world. The LLM might find your question to be outside of its knowledge, but instead of realizing that, it would just make up what it thinks sounds convincing.
It would be like if you asked me what that animal that is half horse and half donkey is called, and instead of saying "shit, I'm blanking" I said "Oh, that is called a Drog", and I couldn't even tell you that I just made up that word, because I would now be convinced that it is factual. Btw, it's "mule".
So there is a real difference until we solve hallucinations, which right now doesn't seem solvable, only (maybe) reducible to insignificance.
That's why you need to know the caveats of the tool you are using.
LLMs hallucinate. People willing to use them need to know where they are more prone to hallucinate, which is where the data about the topic you are requesting is fuzzier. If you ask for the capital of France, it is highly unlikely you will get a hallucination; if you ask for the hair color of the second spouse of the fourth president of the Third French Republic, you probably will get one.
And you need to know what you are using it for. If it's for roleplay, or any non-critical matter, you may not care about hallucinations. If you use them for important things, you need to know that the output has to be human-reviewed before using it. For some things it may be worth the human review, as it would be faster than writing from zero; for other instances it may not be worth it, and then an LLM should not be used for that task.
As an example, I was just writing an LSP library for an API and I tried getting the LLM to generate it from the source documentation. I had my doubts, as the source documentation is quite a bit bigger than my context size. I tried anyway, but I quickly saw that hallucinations were all over the place and hard to fix, so I gave up and have been doing it entirely myself. But before that I did ask the LLM how to even start writing such a thing, as it is the first time I've done this, and the answer was quite on point, probably saving me several hours of searching online trying to find out how to do it.
It's all about knowing the tool you are using, same as anything in this world.
LLMs are poor snapshots of a search engine with no way to fix any erroneous data. If you search something on Stack Overflow you get the page with several people providing snippets and debating the best approach. The LLM does not give you this. Furthermore, if the author goes back and fixes an error in their code, the search will find the fix, whereas the LLM will give you the buggy code with no reasonable way to update it.
LLMs have major issues and even bigger limitations. Pretending they are some panacea is going to disappoint.
An LLM also does not bully you for asking. Nor does it say "duplicate question" for non-duplicate questions... There's a reason people prefer LLMs to SO nowadays.
It's not a panacea. But it's not the world-destroying, useless doom machine that some people like to say it is.
It's a useful tool for some tasks if you know how to use it. Everyone who actively uses it does so because we have found out that it works better for us than other tools for that task; if not, we would not use it.
Speaking from my own personal experience, I tend to ask an LLM first rather than digging through old SO answers like I used to, because I get the answer quicker and a lot of the time it's just better. It's not perfect by any stretch of the imagination, but it serves a purpose for me.
For instance last week I needed a PowerShell command to open an app compatibility menu from the command line. I asked and got this as a response:
(New-Object -ComObject Shell.Application).Namespace((Split-Path "C:\Ruta\A\TuPrograma.exe")).ParseName((Split-Path "C:\Ruta\A\TuPrograma.exe" -Leaf)).InvokeVerb("P&roperties")
Worked on the first try, exactly as I wanted.
You are free to try a search engine with the query "PowerShell command to open an app compatibility menu from the command line" and check for yourself how little help the first results get you.
It's a tool, like many others. The magic lies in knowing when and how to use it. For other things I may not use it, but after a couple of years of using it I'm developing a good sense of which questions it handles well and which are better not even to try.
It takes an enormous amount of energy and processing power to create these shitty snapshots, so in many ways it is doom, considering it will dramatically increase our energy usage.
I get it, you are an AI supporter, but you fail to critically analyze it or even understand it. What other tool would you use whose errors you can't correct and whose workings you can't even determine? You are really operating on faith here that the black box you're getting an answer from is giving you the correct answer.
Perhaps a code snippet works, but beyond that is where it all falls apart. What if the snippet does not work, or causes a problem? The LLM has nothing to offer you here.
LLMs can’t think - only generate statistically plausible patterns
Ah still rolling out the old "stochastic parrot" nonsense I see.
Anyway on to the actual article... I was hoping it wouldn't make these basic mistakes:
[Typescript] looks more like an “enterprise” programming language for large institutions, but we honestly don’t have any evidence that it’s genuinely more suitable for those circumstances than the regular JavaScript.
Yes we do. Frankly, if you've used it, it's so obviously better than regular JavaScript that you probably don't need more evidence (it's like looking for "evidence" that film stars are more attractive than average people). But anyway, we do have great papers like this one.
Anyway that's slightly beside the point. I think the article is right that smart people are not invulnerable to manipulation or falling for "obviously" stupid ideas. I know plenty of very smart religious people for example.
However I think using this to dismiss LLMs is dumb, in the same way that his dismissal of Typescript is. LLMs aren't homeopathy or religion.
I have used LLMs to get some work done and... guess what, it did the work! Do I trust it to do everything? Obviously not. But sometimes I don't need perfect code. For example recently I asked it to create an example SystemVerilog file for me utilising as many syntax features as possible (testing an auto-formatter). It did a pretty good job. Saved some time. What psychological hazard have I fallen for exactly?
Overall, B-. Interesting ideas but flawed logic.
LLMs can’t think - only generate statistically plausible patterns
Ah still rolling out the old “stochastic parrot” nonsense I see.
Ah still rolling out the old "computers think" pseudo-science.
I have used LLMs to get some work done and… guess what, it did the work!
Ah yes the old pointless vague anecdote.
What psychological hazard have I fallen for exactly?
Promoting pseudo-science.
Overall D. Neither interesting nor new nor useful.
Ah yes the old pointless vague anecdote.
If your argument is "LLMs can't do useful work", and then I say "no, I've used them to do useful work many times" how is that a pointless vague anecdote? It's a direct proof that you're wrong.
Promoting pseudo-science.
Sorry what? This is bizarre.
Amen
And to the point that smart people fall for dumb biases: we just need to look at the object-oriented mania of the 2000s to late 2010s to see us shoehorn one paradigm into everything without critically considering whether it made sense over other models.
Can an LLM do everything I need yet? No.
But is a stochastic parrot good enough to help me complete a function and help me restructure code? Yes definitely.
Claude is good enough for so much of the low-value code I write that it is actually a useful tool. I have to review the code, but it's usable.
I use AI search to look up functions that I don't need detailed docs for, or to help me debug arcane library-specific errors (just had one earlier today where, in polars, the list and array types are very much not interchangeable and the explode method was failing).
I still read the docs on things that are critical, and I write the critical paths, dictate the structure, and understand the problem I'm solving well.
It's really amazing the number of people trying to argue that LLMs are useless, while simultaneously so many people are using them successfully. Makes me wonder if they've even tried them.
Ah still rolling out the old "stochastic parrot" nonsense I see.
It is a bunch of stochastic parrots. It just happens frequently that the words they are parroting were originally written by a bunch of intelligent people who were knowledgeable in their fields.
Note this doesn't make the parrots intelligent - in the same way that a book written by Einstein to explain special relativity doesn't have any intelligence of its own. Einstein was intelligent, his words transport his intelligent ideas, but the book conveying them to other people (that is, the printed pages with a cardboard cover) is as dumb as a stone. You would not ask a piece of cardboard to solve a math problem, would you?
Your comment doesn't account for the fact that LLMs can generalise. Often not very well but they can produce outputs for inputs not seen in their training sets. Otherwise what would be the point?
You would not ask a piece of cardboard to solve a math problem, would you?
Uhhh you know LLMs can solve quite complex maths problems? Including novel ones.
What caught my attention is that assessments of AI are becoming polarized and somewhat a matter of belief.
Proceeds to write a belief as a statement in the following paragraph.
If you think LLMs don't think (I won't argue that they aren't extremely dumb), please define what thinking is before continuing, and if your definition of thinking doesn't apply to humans, we won't be able to agree.
The burden of proof is on those who say that LLMs do think.
I asked for your definition; I cannot prove something if we do not agree on a definition first.
You also misread what I said; I did not say AI was thinking.
The burden of proof is on the one who made an affirmation.
I'm not the one who made an affirmation to which field experts don't know the answer.
But depending on your definition of thinking, some of these questions can be answered.
I don't think y'all are disagreeing but maybe this sentence is somewhat confusing:
If you think LLMs don't think (I won't argue that they aren't extremely dumb), please define what thinking is,
Maybe the "don't" shouldn't be there.
No, it is there because that's what they claim.
Nobody yet knows how it works; we don't know how LLMs process information.
Anyone who claims it really thinks, or that it isn't thinking, is stating a belief; this is not something the current ML field knows.
Well, the neural network is given a prefix (a series of tokens) and a token, and it spits out how likely it is that the token follows the prefix. Text is generated by calculating this probability for all known tokens, then picking one at random, weighted by the calculated probabilities.
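To make that concrete, here is a minimal Python sketch of that sampling step (the tokens and scores are made up for illustration; a real model scores tens of thousands of tokens at every step):

import math
import random

def sample_next_token(scores: dict[str, float]) -> str:
    # Softmax: turn the model's raw scores into probabilities over all known tokens.
    max_s = max(scores.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - max_s) for tok, s in scores.items()}
    total = sum(exps.values())
    probs = {tok: e / total for tok, e in exps.items()}
    # Weighted random pick: likelier tokens come out more often, but any token can.
    return random.choices(list(probs), weights=list(probs.values()), k=1)[0]

# Hypothetical scores after the prefix "The capital of France is"
print(sample_next_token({" Paris": 9.1, " the": 3.2, " a": 2.8, " Lyon": 1.5}))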
And the brain is made out of neurons that send electric signals between them and operate muscles.
That doesn't explain how the brain thinks.
I don't think the current common implementations of AI systems are "thinking", and I'll base my argument on Oxford's definitions of the words. Thinking is defined as "the process of using one's mind to consider or reason about something". I'll ignore the word "mind" and focus on the word "reason". I don't think what AIs are doing counts as reasoning as defined by Oxford. Let's go to that definition: "the power of the mind to think, understand, and form judgments by a process of logic". I take issue with the assertion that they form judgments. For completeness, though I don't think its definition is particularly relevant here, a judgment is: "the ability to make considered decisions or come to sensible conclusions".
I think when you ask an LLM how many 'r's there are in Strawberry and questions along this line you can see they can't form judgments. These basic but obscure questions are where you see that the ability to form judgments isn't there. I would also add that if you "form judgments" you probably don't need to be reminded you formed a judgment immediately after forming one. Like if I ask an LLM a question, and it provides an answer, I can convince it that it was wrong whether I'm making junk up or not. I can tell it it made a mistake and it will blindly change its answer whether it made a mistake or not. That also doesn't feel like it's able to reason or make judgments.
This is where all the hype falls flat for me. It feels like sometimes it looks like a concrete wall, but occasionally that concrete wall is made of wet paper. You can see how impressive the tool is and how paper thin it is at the same time. It's cool, it's useful, it's fake, and that's ok. Just be aware of what the tool is.
I think when you ask an LLM how many 'r’s there are in Strawberry and questions along this line you can see they can’t form judgments.
Like an LLM, you are making a wrong affirmation based on lacking knowledge.
Current LLMs input and output tokens; they don't ever see the individual letters, they see tokens. For strawberry, they see 3 tokens:
They don't have any information on what characters are in these tokens, so they come up with something. If you learned a language only by speaking, you'd be unable to write it down correctly (except in purely phonetic systems); instead you'd come up with what you think the written word should look like.
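If you want to see this for yourself, OpenAI's open-source tiktoken library exposes one such tokenizer; the exact split depends on the encoding, but the point is the same: the model receives numeric IDs for sub-word chunks, not letters.

# pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # one widely used encoding
ids = enc.encode("strawberry")
print(ids)                                  # numeric token IDs, not letters
print([enc.decode([i]) for i in ids])       # the sub-word chunks the model actually "sees"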
I would also add that if you “form judgments” you probably don’t need to be reminded you formed a judgment immediately after forming one.
You come up with the judgment before you are aware of it: https://www.unsw.edu.au/newsroom/news/2019/03/our-brains-reveal-our-choices-before-were-even-aware-of-them--st
can tell it it made a mistake and it will blindly change its answer whether it made a mistake or not. That also doesn't feel like it's able to reason or make judgments.
That's also how the brain can work: it comes up with a plausible explanation after having the result.
See the experiments discussed here: https://www.youtube.com/watch?v=wfYbgdo8e-8
I showed the same behavior in humans as some of the behavior you observed in LLMs; does this mean that, by your definition, humans don't think?
If the LLM could reason, shouldn't it be able to say "my token training prevents me from understanding the question as asked. I don't know how many 'r's there are in Strawberry, and I don't have a means of finding that answer"? Or at least something similar, right? If I asked you what some word in a language you don't know means, you should be able to say "I don't know that word or language". You may be able to give me all sorts of reasons why you don't know it, and that's all fine. But you would be aware that you don't know, and would be able to say "I don't know".
If I understand you correctly, you're saying the LLM gets it wrong because it doesn't know or understand that words are built from letters, because all it knows are tokens. I'm saying that's fine, but it should be able to reason that it doesn't know the answer, and say that. I assert that it doesn't know that it doesn't know what letters are, because it is incapable of coming to that judgment about its own knowledge and limitations.
Being able to say what you know and what you don't know is critical to being able to solve logic problems. Knowing which information is missing and can be derived from known things, and which cannot be derived, is key to problem solving based on reason. I still assert that LLMs cannot reason.
I’m saying that’s fine, but it should be able to reason that it doesn’t know the answer, and say that.
That is of course a big problem. They try to guess too much stuff, but it's also why it kinda works. Symbolic AIs have the opposite problem: they are rarely useful because they can't guess stuff; they are rooted in hard logic and cannot come up with a reasonable guess.
Now, humans also try to guess stuff and sometimes get it wrong; it's required in order to produce results from our thinking and not be stuck in a state where we don't have enough data to do anything, like a symbolic AI.
So this is becoming a spectrum, with humans somewhere in the middle between LLMs and symbolic AIs.
LLMs are not completely unable to say what they know and don't know, they are just extremely bad at it from our POV.
The problem with "does it think" is that it doesn't give any quantity or quality.
Is the argument that LLMs are thinking because they make guesses when they don't know things, combined with no quantity or quality being provided to describe thinking?
If so, I would suggest that the word "guessing" is doing a lot of heavy lifting here. The real question would be "is statistics guessing?" I would say guessing and statistics are not the same thing, and Oxford would agree. An LLM just grabs tokens based on training data about what word or token most likely comes next; it will just be using whatever the statistically most likely next token or word is. I don't think grabbing the most likely next token counts as guessing. That feels very algorithmic and statistical to me. It is also possible I'm missing the argument still.
Is the argument that LLMs are thinking because they make guesses
No, it's that you can't root the argument that they don't think in the fact that they make stuff up, because humans do too. You could root it in the amount of things they guess wrong, but that's extremely hard to measure.
Again, I'm not claiming that they think, but that we don't know until one or the other is proven.
Right now, thinking that one or the other is true is belief.
Since LLMs run on CPUs with a lot of memory, do you agree that my calculator is thinking?
This argument makes no more sense than trying to say that a plant is thinking because brains are made of cells and so are plants.
You think computation is thinking?
I asked for your definition of thinking.
The OP talked about belief, then made a statement using a word that is not precisely defined.
If you think computation is thinking then by your definition the LLM is thinking.
But that's your definition of thinking.
'Please succinctly answer a question of philosophy that has plagued mankind for thousands of years. Can't? <crosses arms with a superior smirk> I win.'
Claiming LLMs can't think with the current information available, and calling that not a belief, is claiming to have an answer to this philosophical question.
The only sensible answer is saying you don't know, or being aware and communicating that your statement is a belief.
Here's a big important test you can use to see if something is actually useful and effective.
Pick a random tool and ask all your friends and neighbors what they last used it for. "Hey Bob, I was wondering, what was the last thing you used your belt sander/hammer/paintbrush for?" You'll probably get a very accurate answer about something that needed doing. "Oh, I had to sand down the windowsill because the paint was cracked" or "I tightened the screws on my coffee table".
Now do the same for AI.
The big problem with asking if AI is useful is that people suck at figuring out how to do someone else's work, but they've got a pretty good idea what their own work is like. As a result, it's very easy to think that AI can do someone else's job, but for YOUR job, that you actually understand, you can easily see what bullshit AI spouts and how it misses all the important bits.
Sure, if your idea is that "Programmers write code", then yeah, AI can do that. Similarly, "authors write stories" is true, and AI can write stories. But if you know very slightly more, you realize that programmers only write code like 10% of the time, and authors probably write words less than 10% of the time. The job is about structuring and planning and laying out, the typing is just the final details.
But if you understand fuckall about a job, then yeah AI can definitely generate stuff that looks like other stuff, because it's a machine specifically designed to make stuff that looks like other stuff.
I actually have a good answer, but in my case it'd be "I wanted to know what that one plant I saw was". AI-based pattern matching to identify plant or animal species is pretty handy.
It's also way more sensible than trying to use text generation for anything useful.
Fair, I was mostly talking about LLMs and other generative AI.
This is a mischaracterization of how AI is used for coding and how it can lead to job loss. The use case is not "have the AI develop apps entirely on its own", it's "allow one programmer to do the work of 3 programmers by using AI to write or review portions of code" and "allow people with technical knowledge who are not skilled programmers to write code that's good enough without the need for dedicated programmers." Some companies are trying to do the first one, but almost everyone is doing the second one, and it actually works. That's how AI leads to job loss. A team of 3 programmers can do what used to take a team of 10, and so on.