Data Collection

The European Repository of Cyber Incidents (EuRepoC) gathers, codes, and analyses publicly available information from over 200 sources and 600 Twitter accounts on a daily basis to report on dynamic trends in the global, and particularly the European, cyber threat environment.

Standardised Coding

Scientific coding of cyber incidents based on rigid coding procedures


Based on a standardised codebook, experts in IT forensics and in political and legal attribution assess data on global cyber incidents against a list of 60 criteria. The coding follows rigid procedures that require regular evaluation by internal (second) and external reviewers, and it is publicly open to users through the comment function of the dataset. Scientific coding of cyber incidents is an iterative process that requires constant refinement and recalibration, e.g., of intensity scores and attribution evidence. New data is continuously fed into the dataset, which in turn feeds the dashboard on cyber incidents. Standardisation thus makes targeted comparisons of incidents and incident types possible over time.

EuRepoC Codebook 1.0 (PDF)

Interdisciplinary Perspective

Cyber incidents have ceased to be a purely technical issue – our interdisciplinarity brings a necessary perspective


As cyber incidents have long ceased to be a purely technical issue, our interdisciplinary coding scheme provides the necessary multidimensional perspective, facilitating much-needed, cross-domain discourse on this increasingly volatile threat environment.

The EuRepoC dataset aggregates the results of three coding processes:



  • the political coding by an expert focusing on political attribution and response patterns, as well as entanglements with offline conflicts;


  • and last, but not least, the legal coding by a legal expert who gauges the legal impact of cyber incidents, the legal justification of different response options, as well as openly available evidence as the legal basis for sanctions.

Data Selection

The EuRepoC dataset focuses on publicly reported cyber incidents and therefore leaves out a potentially large number of cases that go unreported due to non-detection or non-disclosure. Moreover, it includes cyber incidents only if they a) have affected political or state actors/institutions, b) have been associated with state actors as the actual “masterminds,” or c) have been “publicly politicized, regardless of the affected target” (Steiger et al. 2018). Cyber incidents that concern specific stakeholders but are not specifically addressed by political actors (e.g., many ransomware attacks) are thus deliberately excluded.
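The three inclusion criteria can be read as a simple "any of a), b), c)" filter. The following Python snippet is only an illustrative sketch of that logic, not the project's actual tooling: the `Incident` class, its field names, and the sample incidents are hypothetical and do not correspond to real dataset columns or cases.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    """Hypothetical, simplified record; field names are assumptions."""
    name: str
    affects_political_or_state_actor: bool  # criterion a)
    associated_with_state_actor: bool       # criterion b)
    publicly_politicized: bool              # criterion c)


def is_in_scope(incident: Incident) -> bool:
    """Return True if at least one of the three criteria is met."""
    return (
        incident.affects_political_or_state_actor
        or incident.associated_with_state_actor
        or incident.publicly_politicized
    )


# Made-up examples for illustration only.
incidents = [
    Incident("ministry network breach", True, False, False),
    Incident("ransomware against a small firm", False, False, False),
    Incident("company hack debated in parliament", False, False, True),
]

selected = [i.name for i in incidents if is_in_scope(i)]
print(selected)
```

In this sketch, the ransomware case is dropped because none of the three flags is set, which mirrors the deliberate exclusion described above.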

Cyber Intensity Scale

The dataset also includes a cyber intensity scale that assesses the coded incident types, their potential physical effects, and their socio-political severity. These values are summed into a total (weighted) cyber intensity score ranging from 1 to 15.
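As a rough illustration of how a weighted total of this kind can be computed, the Python sketch below sums component values multiplied by their weights and checks that the result lies between 1 and 15. The component names, the example values, and the weights are all assumptions made for this example; the actual indicators and weighting are defined in the EuRepoC codebook.

```python
SCORE_MIN, SCORE_MAX = 1, 15  # range stated for the total intensity score


def total_intensity(values: dict[str, int], weights: dict[str, int]) -> int:
    """Weighted sum of component values (hypothetical components and weights)."""
    if values.keys() != weights.keys():
        raise ValueError("values and weights must cover the same components")
    total = sum(weights[name] * values[name] for name in values)
    if not SCORE_MIN <= total <= SCORE_MAX:
        raise ValueError(f"total {total} is outside {SCORE_MIN}-{SCORE_MAX}")
    return total


# Hypothetical component values and weights, for illustration only.
example_values = {
    "incident_type": 2,
    "physical_effects": 1,
    "socio_political_severity": 3,
}
example_weights = {
    "incident_type": 1,
    "physical_effects": 2,
    "socio_political_severity": 1,
}

score = total_intensity(example_values, example_weights)
print(score)  # 1*2 + 2*1 + 1*3 = 7
```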

How We Work

An instrument as dynamic and complex as the EuRepoC dataset must undergo continuous specification and improvement.


We are committed to the regular testing and adjustment of our methodology and coding process, without risking the comparability of our data over time. 


However, if coding decisions have to be refined in a way that affects long-term data comparability, this will be communicated thoroughly and transparently so that our data consumers are aware of the changes. We strive for a coherent data-driven approach that fosters quantitative, empirical cyber conflict research while also integrating these insights into our more conceptual, qualitative work; this makes a constant and self-reflective evaluation of our data collection and processing mandatory. In doing so, we often benefit from the external feedback, advice, and encouragement of users and our regular reviewers.