AI partitioning keeps the threat surface from growing, GitHub execs claim

“We don’t train the model on the customer data that is coming in,” one exec tells IT Brew.

You gotta keep ’em separated.

AI-powered software prompt engineering has enormous potential, but with it comes the risk of expanding threat surfaces and exposing proprietary code. Samsung notably experienced a data leak earlier this year, and on May 1 it ordered employees to stop using outside generative AI tech on company-owned devices, Bloomberg reported.

At GitHub, meanwhile, executives say their prompt engineering software, Copilot, can write working code from abstract, natural-language prompts without customers exposing internal data.
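
To picture that workflow, here's a minimal sketch: a developer supplies only a natural-language comment, and the assistant proposes a working function. The prompt, function name, and body below are hypothetical illustrations, not actual Copilot output.

# Natural-language prompt a developer might type as a comment:
# "Parse an ISO 8601 timestamp string and return seconds since the Unix epoch."
from datetime import datetime, timezone

def iso8601_to_unix_seconds(timestamp: str) -> float:
    """Convert an ISO 8601 timestamp to seconds since the Unix epoch."""
    parsed = datetime.fromisoformat(timestamp)
    # Assumption: treat timestamps without timezone info as UTC.
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.timestamp()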

Executives at the GitHub Universe conference in early November told IT Brew that their AI doesn’t learn from prompts, closing off that avenue of exposure.

“It doesn’t hold anything in it that we’re allowed to continuously [learn from],” Mario Rodriguez, VP of product at GitHub, told IT Brew. “We don’t train the model on the customer data that is coming in.”

In practice, that means the AI you get out of the box is the AI you work with. Copilot won’t independently learn from your code, though Rodriguez said that models above the base model may offer that option in the future.

Threat detected. The way AI is being used can expose internal information to attackers, IT Brew reported in May. Proprietary data leaking into generative AI models can lead to severe security risks, Cyberhaven CEO Howard Ting told IT Brew at the time.

“If I wanted to find out something about you, and I’m launching a targeted attack against you, I can go ask OpenAI questions about what it knows about you,” Ting said. “Some of those data elements might come back to me in the response, because they’re using all this data that’s been uploaded to train their model.”

More to say. AI will also be used to detect vulnerabilities in code, GitHub CSO Mike Hanley said. Around 30 million fixes were made with GitHub’s code-scanning technology before AI was added, Hanley told us. Looking ahead, AI-assisted code scanning, combined with dependency vulnerability patches, will have “a huge impact, as developers will be able to address more issues, with more accuracy and at a faster pace than before,” Hanley said.

“Then you add the massive capabilities that we get from AI on top of that, to automatically suggest those fixes, you’re really putting a superhero cape on your average developer that helps make them better at security without being a security expert,” Hanley said.
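
For a sense of what such an automated fix looks like, here's a minimal sketch, assuming a common case: a SQL injection flaw of the sort code scanners routinely flag, next to the parameterized-query rewrite an AI assistant might suggest. The snippet and function names are illustrative, not GitHub scanner output.

import sqlite3

def find_user_vulnerable(conn: sqlite3.Connection, username: str):
    # Flagged pattern: user input interpolated directly into the SQL string,
    # which lets a crafted username rewrite the query (injection risk).
    query = f"SELECT id, email FROM users WHERE name = '{username}'"
    return conn.execute(query).fetchall()

def find_user_fixed(conn: sqlite3.Connection, username: str):
    # Suggested fix: a parameterized query keeps input out of the SQL text.
    query = "SELECT id, email FROM users WHERE name = ?"
    return conn.execute(query, (username,)).fetchall()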
