Shocked by Copilot / Chat-GPT4 response

OpticsEngineer

So I am using Copilot to help me with my coding tasks, asking it for things like example code to read from a serial port. Sometimes the answers are short and kind of generic; sometimes the answers are longer and more helpful. So I asked Copilot why the difference. And it told me it gives better answers if my IDE (integrated development environment) is open to the code I am working on, so it gets the context of what I am doing. It never occurred to me that Copilot was getting things off my computer like that.
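For context, the kind of snippet I was asking for was along these lines (just a rough sketch using the pyserial package; the port name and settings are placeholders, not what Copilot actually produced):

import serial  # pyserial package

# Open the serial port; the device name and settings are only examples.
with serial.Serial("/dev/ttyUSB0", baudrate=9600, timeout=1.0) as port:
    while True:
        line = port.readline()  # read until newline or timeout
        if not line:
            break  # timed out with no data
        print(line.decode(errors="replace").rstrip())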

I guess I should have been suspicious sooner. Sometimes the example code I got back looked correct but kind of weird; other times it seemed exactly like what I would write. I was wondering why the style fluctuated so randomly, but now I know why.

Well, I guess I learn something every day, but Copilot is learning from me at the same time. I wonder what Microsoft does with that. Does anyone know if Microsoft goes any further with stuff off our computers, or is it not part of what gets all the way back into the ChatGPT databases? It kind of has implications for our employers.
 
To quote Microsoft:
No. GitHub uses neither Copilot Business nor Enterprise data to train the GitHub model.
So if your employees are given a Business/Enterprise license for Copilot, the employer's data is safe.

More here:

 
To quote Microsoft:
No. GitHub uses neither Copilot Business nor Enterprise data to train the GitHub model.
So if your employees are given a Business/Enterprise license for Copilot, the employer's data is safe.
Hmm, semantics, semantics?

There is a difference between 'training' and 'inference'. It may be that Copilot does not use your stuff to train the model, but it does use it as an input when it needs to come up with an answer.

Jack

PS. Does it use its own answers to train future versions of the model? ;-)
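To make the 'inference' part concrete, here is a rough sketch of what sending your code as context can look like. The field names are invented for illustration; this is not GitHub's actual API, just the general shape of a completion request:

import json

# What an editor assistant might bundle into a single completion request.
# Nothing here implies the data is used to retrain the model; it is sent
# along as context so the answer can be tailored to the open project.
request = {
    "prompt": "Show me how to read from a serial port",
    "editor_context": {
        "open_file": "serial_reader.py",              # hypothetical file name
        "visible_code": "import serial\n# TODO ...",  # excerpt of what is on screen
    },
}

# A real client would send this to the vendor's completion endpoint;
# here we only print the payload that would leave the machine.
print(json.dumps(request, indent=2))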
 
I think most would be shocked at the telemetry built into commercial applications today. And unlike the choice you were presented with, most of it is opt-out rather than opt-in.
 
PS. Does it use its own answers to train future versions of the model? ;-)
A recurring theme these days, often without a smiley.

There's a long-running anti-AI thread over on pixls.us featuring such comments:

https://discuss.pixls.us/t/ai-the-ensh-tification-of-the-web/42590/182
 
Thanks for the link. It brought up things I had not thought about yet.
 
So I am using Copilot to help me with my coding tasks, asking it for things like example code to read from a serial port. Sometimes the answers are short and kind of generic; sometimes the answers are longer and more helpful. So I asked Copilot why the difference. And it told me it gives better answers if my IDE (integrated development environment) is open to the code I am working on, so it gets the context of what I am doing. It never occurred to me that Copilot was getting things off my computer like that.

I guess I should have been suspicious sooner. Sometimes the example code I got back looked correct but kind of weird; other times it seemed exactly like what I would write. I was wondering why the style fluctuated so randomly, but now I know why.

Well, I guess I learn something every day, but Copilot is learning from me at the same time. I wonder what Microsoft does with that. Does anyone know if Microsoft goes any further with stuff off our computers, or is it not part of what gets all the way back into the ChatGPT databases? It kind of has implications for our employers.

LLMs’ Data-Control Path Insecurity
 
PS. Does it use its own answers to train future versions of the model? ;-)
A recurring theme these days, often without a smiley.

There's a long-running anti-AI thread over on pixls.us featuring such comments:

https://discuss.pixls.us/t/ai-the-ensh-tification-of-the-web/42590/182
Long ago, before AI was a thing, there was testing to a "gold" standard. But how do you determine that standard? One speaker manufacturer noticed increasing quality variations in its speaker performance over time, even though testing was conducted against a standard. The problem was eventually solved when someone observed the process: the gold standard came from the previous production run, not a fixed standard. So the variations in production would tend to reinforce over time.

It strikes me that using previous answers as training for the next batch is much the same wrong procedure.
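As a toy illustration of why that feedback loop drifts (made-up numbers, not the actual speaker process): if each run's reference is the previous run's output, small per-run errors accumulate instead of staying bounded.

import random

random.seed(1)

FIXED_REFERENCE = 100.0          # a standard that never changes
reference_from_previous = 100.0  # a "standard" taken from the last run

for run in range(1, 11):
    noise = random.gauss(0.0, 0.5)  # per-run measurement error

    # Against a fixed standard the error stays bounded by the noise.
    against_fixed = FIXED_REFERENCE + noise

    # Against the previous run the errors compound into a drift.
    reference_from_previous += noise

    print(f"run {run:2d}: vs fixed = {against_fixed:7.2f}, "
          f"vs previous run = {reference_from_previous:7.2f}")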
 
Long ago, before AI was a thing, there was testing to a "gold" standard. But how do you determine that standard? One speaker manufacturer noticed increasing quality variations in its speaker performance over time, even though testing was conducted against a standard. The problem was eventually solved when someone observed the process: the gold standard came from the previous production run, not a fixed standard. So the variations in production would tend to reinforce over time.
That reminds me of a Ministry of Defence advertising campaign back in the sixties, when I was in the RAF, where "Fred the wheel tapper" changed over 100,000 wheels on railroad rolling stock ... before they found that his hammer-head was cracked.
 
" increasing quality variations .... eventually solved when someone observed the process: the gold standard came from the previous production run."

I ran into a similar thing a few years ago when an R&D engineer fiddled with a production line computer and changed a system path variable setting, so reference data was being pulled from the previous instrument instead of the reference instrument. A process that had been working fine for over five years began drifting to make things that failed or barely passed final acceptance tests. It is really hard to train R&D people not to do stuff like that since they often just can't seem to understand the hundreds or thousands of wasted man-hours they cause, when other people have to deal with unhappy customers, finding root causes, fixing problems, and in the case of medical instruments, complying with government documentation and reporting requirements.
 
So I am using Copilot to help me with my coding tasks, asking it for things like example code to read from a serial port. Sometimes the answers are short and kind of generic; sometimes the answers are longer and more helpful. So I asked Copilot why the difference.
The difference is because it can't think. It just spews out what some human somewhere had coded, plus some "averaging" that resulted from the training.

And it told me it gives better answers if my IDE (integrated development environment) is open to the code I am working on, so it gets the context of what I am doing. It never occurred to me that Copilot was getting things off my computer like that.

I guess I should have been suspicious sooner. Sometimes the example code I got back looked correct but kind of weird; other times it seemed exactly like what I would write. I was wondering why the style fluctuated so randomly, but now I know why.

Well, I guess I learn something every day, but Copilot is learning from me at the same time. I wonder what Microsoft does with that. Does anyone know if Microsoft goes any further with stuff off our computers, or is it not part of what gets all the way back into the ChatGPT databases? It kind of has implications for our employers.
 
