GE-13: Anonymised Voter Roll
Anonymised individual-level voter roll detailing the 13,268,002 voters eligible to vote in the 13th General Election (2013).
Columns
| Name | Title | Description |
|---|---|---|
| uid | Voter UID | [String] Anonymised unique identifier for the voter; this UID is consistent acros |
| birth_year | Birth Year | [Integer] Voter's year of birth |
| sex | Sex | [String] Sex of voter; 'Male' or 'Female' |
| ethnicity | Ethnicity | [String] Ethnicity of voter (e.g. 'Malay', 'Chinese', etc.) |
| state | State | [String] State of the voter's registered constituency |
| parlimen | Parliamentary Seat | [String] Code (P.xxX) and name of the parliamentary seat where the voter is registered |
| dun | State Seat | [String] Code (N.xxX) and name of the state seat where the voter is registered |
| dm_vr | Polling District (Voter Roll) | [String] Code (xxx/yy/zz) and name of the voter's polling district, as listed in the voter roll |
| dm | Polling District (Election) | [String] Code (xxx/yy/zz) and name of the polling district (Daerah Mengundi), as assigned for operational purposes. In particular, this is the column that identifies early voters (Undi Awal). |
| pm | Polling Centre | [String] Name of the polling centre (Pusat Mengundi) where the voter is assigned to vote |
| saluran | Saluran (Queue) | [Integer] Ballot channel/stream number the voter is assigned to within their polling centre |
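
As a minimal sketch of how the columns above might be used, the snippet below computes an approximate age at the election from `birth_year` and the ethnic composition of each parliamentary seat from `parlimen` and `ethnicity`. The rows are synthetic placeholders, not records from the dataset, and the helper names (`age_at_election`, `ethnicity_shares`) are hypothetical; only the column names come from the table.

```python
from collections import Counter, defaultdict

ELECTION_YEAR = 2013  # GE-13 was held in 2013

# Synthetic rows for illustration only; seat names and UIDs are placeholders.
rows = [
    {"uid": "u1", "birth_year": 1950, "ethnicity": "Malay", "parlimen": "Seat A"},
    {"uid": "u2", "birth_year": 1975, "ethnicity": "Malay", "parlimen": "Seat A"},
    {"uid": "u3", "birth_year": 1988, "ethnicity": "Chinese", "parlimen": "Seat A"},
    {"uid": "u4", "birth_year": 1962, "ethnicity": "Indian", "parlimen": "Seat A"},
    {"uid": "u5", "birth_year": 1990, "ethnicity": "Chinese", "parlimen": "Seat B"},
]


def age_at_election(row, election_year=ELECTION_YEAR):
    """Approximate age; only the birth year is published, so it may be off by one."""
    return election_year - int(row["birth_year"])


def ethnicity_shares(rows, seat_column="parlimen"):
    """Return {seat: {ethnicity: share}} where the shares in each seat sum to 1."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[seat_column]][row["ethnicity"]] += 1
    shares = {}
    for seat, counter in counts.items():
        total = sum(counter.values())
        shares[seat] = {eth: n / total for eth, n in counter.items()}
    return shares


ages = [age_at_election(r) for r in rows]
shares = ethnicity_shares(rows)
```

Passing `seat_column="dun"` would give the same breakdown at state-seat level, assuming the rows carry that column.
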
Methodology
This dataset was compiled from official voter rolls (Daftar Pemilih Pilihan Raya, DPPR) released by the Election Commission of Malaysia (EC). Because the EC does not publish voter rolls openly, the data had to be acquired from various stakeholders with access to it.

The most critical step in processing the data for publication on ElectionData.MY was protecting individual privacy. All directly identifying information (name, IC number, address) was removed and replaced with an anonymised unique identifier (UID), while demographic attributes and polling assignment details were retained. Furthermore, any overly identifying attribute was coarsened: the voter's date of birth was reduced to their birth year, and information on locality was omitted entirely (i.e. the polling district is the finest resolution provided).

Users should also note that the EC does not typically include voters' ethnicity in the voter rolls that it sells or shares, although it has access to this information in its internal systems. Therefore, within this dataset, ethnicity was either derived from deterministic matching against various sources of administrative data or inferred from voters' names using machine learning techniques. To validate this process, the dataset was compared against the seat-level statistics published by the EC, which do contain ethnic breakdowns: the ethnic proportions derived from this dataset match the EC's within a margin of 0.1% across all constituencies.

An academic paper (preprint) documenting the full technical details of how this dataset was built and validated will be available by mid-July 2026.
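
The anonymisation and validation steps described above can be illustrated with a rough sketch. This is not the pipeline used to build the dataset: the keyed hash (HMAC-SHA256) for the UID, the input field names (`name`, `ic_number`, `address`, `locality`, `date_of_birth`), the ISO date format, and the reading of the 0.1% margin as 0.1 percentage points are all assumptions made for illustration.

```python
import hashlib
import hmac

# Hypothetical secret; a real pipeline would keep its key private.
SECRET_KEY = b"replace-with-a-private-key"

# Assumed input field names for the identifying and over-precise attributes.
DROPPED_FIELDS = {"name", "ic_number", "address", "locality", "date_of_birth"}


def anonymise(record, key=SECRET_KEY):
    """Drop identifying fields, add a keyed-hash UID, and coarsen the birth date to a year."""
    digest = hmac.new(key, record["ic_number"].encode("utf-8"), hashlib.sha256)
    out = {k: v for k, v in record.items() if k not in DROPPED_FIELDS}
    out["uid"] = digest.hexdigest()[:16]  # same input and key give the same UID
    out["birth_year"] = int(record["date_of_birth"][:4])  # assumes 'YYYY-MM-DD'
    return out


def max_share_gap(derived, reference):
    """Largest absolute gap, in percentage points, between two {seat: {ethnicity: share}} maps."""
    gap = 0.0
    for seat, ref_shares in reference.items():
        seat_shares = derived.get(seat, {})
        for eth in set(ref_shares) | set(seat_shares):
            diff = abs(seat_shares.get(eth, 0.0) - ref_shares.get(eth, 0.0))
            gap = max(gap, diff * 100)
    return gap


# Synthetic example record and shares; none of these values come from the dataset.
raw = {
    "name": "Example Voter",
    "ic_number": "000000-00-0000",
    "address": "1 Example Road",
    "locality": "Example Locality",
    "date_of_birth": "1980-05-17",
    "sex": "Female",
    "state": "Example State",
}
anon = anonymise(raw)

derived = {"Seat A": {"Malay": 0.5004, "Chinese": 0.4996}}
reference = {"Seat A": {"Malay": 0.5, "Chinese": 0.5}}
within_margin = max_share_gap(derived, reference) <= 0.1
```

A keyed hash is used in this sketch rather than a plain hash because IC numbers have a predictable structure, so an unkeyed digest could be reversed by enumerating candidates.
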