Update

Troy Hunt and I have worked together so that people can check on Have I Been Pwned whether their data was exposed in this breach.

Background

On March 31, the University of California sent out a notice that it was targeted in a nationwide cyber attack. This attack exploited multiple vulnerabilities in Accellion FTA, an aging secure file transfer product ironically touted by its manufacturer for fostering a breach-resistant environment. While UC has provided identity protection (for one year) to its stakeholders, it has not, to date, specified who exactly is affected or notified individuals that their data was exposed. As the data is already being sold and even offered publicly, understanding the impact of the breach and the extent of the data exposure is of paramount importance.

The Data

As UC mentions in its FAQ, "because much of the data is unstructured, and because of the volume of files, this is a labor-intensive and time-consuming process that involves hundreds of hours of detailed review and analysis..." Indeed, working with this data was not fun in the least. Though this analysis is undoubtedly limited and likely undercounts the exposure, it still paints a frightening picture.

Social Security Numbers

The most sensitive data in any breach is usually an SSN. After many tortured hours of sifting through obscure file types and agonizingly determining how to programmatically search them somewhat thoroughly, a satisfactory algorithm was devised and sent off to prowl the data.
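The search algorithm itself is not reproduced in this post, so what follows is only a minimal sketch of one common approach, assuming each file has already been converted to plain text. It matches nine-digit candidates with a regular expression and discards those that break the Social Security Administration's structural rules (area 000, 666, or 900-999; group 00; serial 0000). The names `SSN_PATTERN`, `is_plausible_ssn`, and `find_ssns` are hypothetical and chosen for illustration.

```python
import re

# Candidate SSNs: AAA-GG-SSSS, AAA GG SSSS, or nine bare digits.
# The separator must be the same in both positions (backreference \2),
# and the lookarounds stop a match from sitting inside a longer digit run.
SSN_PATTERN = re.compile(r"(?<!\d)(\d{3})([- ]?)(\d{2})\2(\d{4})(?!\d)")


def is_plausible_ssn(area: str, group: str, serial: str) -> bool:
    """Reject numbers the SSA has never issued (structural check only)."""
    if area == "000" or area == "666" or area.startswith("9"):
        return False
    if group == "00" or serial == "0000":
        return False
    return True


def find_ssns(text: str) -> set:
    """Return the distinct plausible SSNs in text, normalized to AAA-GG-SSSS."""
    found = set()
    for match in SSN_PATTERN.finditer(text):
        area, _separator, group, serial = match.groups()
        if is_plausible_ssn(area, group, serial):
            found.add(f"{area}-{group}-{serial}")
    return found


if __name__ == "__main__":
    # Made-up sample text; none of these values come from the breach data.
    sample = "SSN 123-45-6789, again as 123456789, invalid 000-12-3456"
    print(find_ssns(sample))
```

A structural check like this cannot tell a real SSN from any other nine-digit number that happens to pass the rules, so bare-digit matches in particular would still need context (a nearby "SSN" label, a name, a date of birth) before being counted. Returning a set means a number repeated across files is only counted once when the per-file results are combined.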

Searching the data for sensitive info.
Combining the processed data from all the files.
Some of the cleaned exposed data.

The result: 412,738 Social Security numbers. Certainly not a trivial amount. These numbers are usually found alongside other personally identifiable information, such as name and date of birth. There are all sorts of documents in the data, ranging from financial to educational. Speaking of which, what exactly does the breach contain?

Types of Documents

UC Application Data

More than 215,000 UC application responses were exposed, apparently for Fall 2020. Names are not included, but everything else one expects from a college application is: applicant ID, email, birth date, educational history, address, admission decisions, etc.

In addition, 213,000 names and emails were exposed from the Fall 2021 financial aid list.

Undergraduate Experience Survey

The UCUES is a survey used by the UC campuses to gauge important student metrics. Its questions often ask for sensitive information about income and family status, as well as detailed demographic information. A list of UCB's questions can be found here. Over 75,000 responses are in the data, complete with name, address, student ID, etc., and the responses to every survey question.

Early Academic Outreach Program

The EAOP is a system-wide program that seeks to encourage underserved groups to pursue higher education. Information about 19,000 participants in this program and similar ones at all UC campuses was exposed, including name, date of birth, address, high school information, etc. A significant portion of the program's participants are minors.

Transfers

Finally, data for more than 4,700 transfer students was exposed from certain campuses. The data is similar to the EAOP data.

Gathering the exposed transfer data.

Conclusion

This data breach is clearly quite substantial. If you think you may be affected, you can find UC's recommendations here. We are reminded of the ever-pertinent importance of ensuring vendor security and being wary of legacy applications.