SearchGPT is OpenAI’s long-awaited answer to... It’s powered by the GPT-4 model family and is planned to be integrated into ChatGPT at some point in the future.
I don’t understand. Isn’t CGPT already just a fancy search engine?
No, it’s fancy autocomplete at a huge scale. Sometimes it returns correct answers.
A search engine should take a list of websites and metadata about those websites and return results based on some ranking, with the original goal being to get you what you wanted. (The current goal is just how much money can be extracted from your hands on the keys.)
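As a toy sketch of that index-and-rank idea (the pages, words, and scoring below are entirely made up, and real engines use far richer ranking signals):

```python
from collections import defaultdict

# Hypothetical "list of websites": URL -> page text.
pages = {
    "example.com/llms": "large language models predict the next token",
    "example.com/search": "a search engine ranks indexed pages by relevance",
    "example.com/cats": "cats are popular on the internet",
}

# Inverted index: word -> set of pages containing that word.
index = defaultdict(set)
for url, text in pages.items():
    for word in text.split():
        index[word].add(url)

def search(query):
    """Rank pages by how many query words they contain (a crude relevance score)."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for url in index.get(word, ()):
            scores[url] += 1
    return sorted(scores, key=scores.get, reverse=True)

print(search("search engine ranks"))
```

Every result is a lookup into stored documents; whether the ranking serves you or the advertisers is the part the parenthetical complains about.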
No. ChatGPT pulls information out of its ass, and as I read it, SearchGPT actually links to sources (while also summarizing them and pulling information out of its ass, presumably). ChatGPT “knows” things; SearchGPT should actually look stuff up and present it to you.
Kagi has supported this for a while. You can end your query with a question mark to request a “quick answer” generated by an LLM, complete with sources and citations. It’s surprisingly accurate and useful!
…where do you think CGPT gets the information it “knows” from?
It’s not doing live queries at all; it just makes up a statistically likely answer from its training data.
Training data from where…?
I mean, yeah, it does include data scraped from the web, but that data is all three years old at this point. Hardly a search engine by any metric.
So, in your mind, a “search engine” isn’t an engine that searches the web?
It literally doesn’t do that
It literally does…
You just said so yourself in the comment I replied to.
This is like saying the library search engine and Bob the drunkard who looked at the shelf labels and swears up and down he knows where everything is are the same thing.
Look, ChatGPT is an averaging machine. Yes, it has ingested a significant chunk of the text on the internet, but it does not reproduce text exactly as it found it; it produces an average of all the text it has seen, weighted towards what seems to make sense for the situation. For really common information, this is fine. For niche information, it is bullshitting without any indication.
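A crude way to see the “averaging drowns out niche information” point (the corpus and frequencies here are invented for illustration): pick whichever continuation was most common in the training text.

```python
from collections import Counter

# Made-up training corpus: 50 documents continue "the sky is" with "blue",
# and 3 niche-but-relevant documents continue it with "grey".
training_continuations = ["blue"] * 50 + ["grey"] * 3

# The most frequent continuation wins, regardless of which one you needed.
most_common = Counter(training_continuations).most_common(1)[0][0]
print(most_common)
```

The niche answer is simply outvoted, and nothing in the output signals that this happened.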
It’s…not remotely the same thing?
It’s like saying an engine that searches the web for answers to your query is a search engine…?
Nor does SearchGPT.
ChatGPT is not a search engine, it generates predictions on what is the most likely text completion to your prompt. It does not pull information from a database. It is a mathematical model. Its weights do not contain the training data. It is not indexing anything. You will not find any page from the internet in the model. It is all averaged out and any niche detail is lost, overpowered by more prevalent but less relevant training data. This is why it bullshits. When it bullshits it is not because it searched for something and came up empty, it is because in the training data there simply was not a sufficient number of occurrences of the answer to influence its response against the weight of all the other more prevalent training data. ChatGPT does not search anything.
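That “prediction, not lookup” claim can be sketched in a few lines. The vocabulary and the scores (logits) below are invented; in a real model the logits come out of billions of learned weights, but the final step is the same: turn scores into probabilities and emit the likeliest token. Nothing is fetched from a database at any point.

```python
import math

vocab = ["Paris", "London", "banana"]  # hypothetical next-token candidates
logits = [4.0, 2.0, 0.5]               # made-up scores from the model's weights

def softmax(xs):
    """Turn raw scores into a probability distribution."""
    exps = [math.exp(x - max(xs)) for x in xs]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax(logits)
prediction = vocab[probs.index(max(probs))]
print(prediction)
```

If the training data never pushed the right token’s score high enough, the model confidently emits a wrong one, which is the bullshitting described above.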
It is every bit as much of a search engine as SearchGPT, with the exception of more recent information, as I’ve already explained.
it generates predictions on what is the most likely text completion to your prompt.

…using information from the internet. I’m honestly baffled this needs to be explained. Once again, I ask: where do you think the information it generates comes from? It’s not just word salad; the words contain information. Were you unaware of the many, many OpenAI lawsuits based on this fact?
This is why it bullshits.

It bullshits because it’s trained on bullshit, doesn’t actually know anything, and isn’t programmed to say “I don’t know”.
From the training dataset, which was frozen years ago. It’s like knowing something instead of looking it up. It doesn’t provide sources; it just makes shit up based on what was in the (old) dataset. That’s totally different from looking up information based on what you know and then using the new information to create an informed answer backed up by sources.
It is, but it’s not updated in real time, unlike SearchGPT.