PolitikMY

Negeri Sembilan SE-15: Anonymised Voter Roll

Anonymised individual-level voter roll covering the 864,425 voters eligible to vote in the 15th Negeri Sembilan (N9) State Election (2023).

State Elections (DPPR) | 864,425 rows | 11 columns | Data as of 2023-08-12 | Updated 2026-06-23

Columns

| Name | Title | Description |
| --- | --- | --- |
| uid | Voter UID | [String] Anonymised unique identifier for the voter; this UID is consistent across … |
| birth_year | Birth Year | [Integer] Voter's year of birth |
| sex | Sex | [String] Sex of voter; 'Male' or 'Female' |
| ethnicity | Ethnicity | [String] Ethnicity of voter (e.g. 'Malay', 'Chinese', etc.) |
| state | State | [String] State of the voter's registered constituency |
| parlimen | Parliamentary Seat | [String] Code (P.xxX) and name of the parliamentary seat where the voter is registered |
| dun | State Seat | [String] Code (N.xxX) and name of the state seat where the voter is registered |
| dm_vr | Polling District (Voter Roll) | [String] Code (xxx/yy/zz) and name of the voter's polling district, as listed in the voter roll |
| dm | Polling District (Election) | [String] Code (xxx/yy/zz) and name of the polling district (Daerah Mengundi), as per how they were assigned for operational purposes. In particular, this column is what identifies early voters (Undi Awal). |
| pm | Polling Centre | [String] Name of the polling centre (Pusat Mengundi) where the voter is assigned to vote |
| saluran | Saluran (Queue) | [Integer] Ballot channel/stream number the voter is assigned to within their polling station |
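
As a rough illustration of the column types listed above, the following Python sketch reads rows into a typed record. It assumes the data is available as a CSV export whose header row uses the column names exactly as listed; that file format is an assumption, and the sample row is made up for illustration rather than taken from the dataset.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class VoterRecord:
    """One row of the voter roll, typed as in the column list."""
    uid: str
    birth_year: int
    sex: str
    ethnicity: str
    state: str
    parlimen: str
    dun: str
    dm_vr: str
    dm: str
    pm: str
    saluran: int


# The only two columns documented as [Integer]; everything else stays a string.
INT_COLUMNS = {"birth_year", "saluran"}


def parse_rows(handle):
    """Yield one VoterRecord per CSV row, casting the integer columns."""
    for row in csv.DictReader(handle):
        typed = {k: int(v) if k in INT_COLUMNS else v for k, v in row.items()}
        yield VoterRecord(**typed)


# Made-up sample row (placeholder codes and names), standing in for a real file.
SAMPLE = io.StringIO(
    "uid,birth_year,sex,ethnicity,state,parlimen,dun,dm_vr,dm,pm,saluran\n"
    "abc123,1980,Female,Malay,Negeri Sembilan,P.000 Example,N.00 Example,"
    "000/00/00 Example,000/00/00 Example,Example Hall,2\n"
)
records = list(parse_rows(SAMPLE))
```

Streaming rows through a generator rather than loading everything at once may be preferable for a file of this size, although 864,425 rows would also fit comfortably in memory on most machines.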

Methodology

This dataset was compiled from official voter rolls (Daftar Pemilih Pilihan Raya, DPPR) released by the Election Commission of Malaysia (EC). Since the EC does not publish voter rolls openly, the data had to be acquired from various stakeholders with access to it.

The most critical step in processing the data for publication on ElectionData.MY is protecting individual privacy. All directly identifying information (name, IC number, address) was removed and replaced with an anonymised unique identifier (UID), while demographic attributes and polling assignment details were retained. Furthermore, any overly identifying attribute was coarsened: the voter's date of birth was reduced to their birth year, and information on locality was omitted entirely (i.e. polling district is the finest geographic resolution provided).

Users should also note that the EC does not typically include voters' ethnicity in the voter rolls that it sells or shares, although it has access to this information in its internal systems. Therefore, ethnicity in this dataset was either derived by deterministic matching against various sources of administrative data, or inferred from voters' names using machine learning techniques. To validate this process, the dataset was compared against the seat-level statistics published by the EC, which do contain ethnic breakdowns: the ethnic proportions derived from this dataset match the EC's within a margin of 0.1% across all constituencies. An academic paper (preprint) documenting the full technical details of how this dataset was built and validated will be available by mid-July 2026.
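
The anonymisation and validation steps described above can be sketched in Python as follows. This is an illustrative sketch only, not the procedure actually used for this dataset: the raw field names (`name`, `ic_number`, `address`, `date_of_birth`), the keyed-hash UID scheme, and the secret key are all hypothetical, and the 0.1% margin is interpreted here as 0.1 percentage points, which is also an assumption.

```python
import hashlib
import hmac
from collections import Counter, defaultdict
from datetime import date

# Hypothetical secret; a keyed hash is one possible way to get a stable UID.
SECRET_KEY = b"hypothetical-secret"

# Hypothetical names for the directly identifying raw fields.
IDENTIFYING_FIELDS = ("name", "ic_number", "address", "date_of_birth")


def anonymise(raw: dict) -> dict:
    """Drop identifying fields, add a UID, and coarsen date of birth to a year."""
    digest = hmac.new(SECRET_KEY, raw["ic_number"].encode(), hashlib.sha256)
    out = {k: v for k, v in raw.items() if k not in IDENTIFYING_FIELDS}
    out["uid"] = digest.hexdigest()[:16]
    out["birth_year"] = raw["date_of_birth"].year
    return out


def ethnic_shares(records) -> dict:
    """Return {seat: {ethnicity: percentage of that seat's voters}}."""
    counts = defaultdict(Counter)
    for rec in records:
        counts[rec["dun"]][rec["ethnicity"]] += 1
    return {
        seat: {eth: 100 * n / sum(c.values()) for eth, n in c.items()}
        for seat, c in counts.items()
    }


def max_gap(derived: dict, reference: dict) -> float:
    """Largest absolute gap, in percentage points, over all seats and groups."""
    gaps = []
    for seat, ref in reference.items():
        got = derived.get(seat, {})
        for eth in set(ref) | set(got):
            gaps.append(abs(got.get(eth, 0.0) - ref.get(eth, 0.0)))
    return max(gaps, default=0.0)


# Made-up raw row and reference figures, for illustration only.
raw_row = {
    "name": "Example Name", "ic_number": "000000-00-0000",
    "address": "Example Address", "date_of_birth": date(1980, 5, 17),
    "sex": "Female", "ethnicity": "Malay", "dun": "N.00 Example",
}
clean_row = anonymise(raw_row)

sample = [{"dun": "N.00 Example", "ethnicity": e}
          for e in ("Malay", "Malay", "Malay", "Chinese")]
shares = ethnic_shares(sample)
reference = {"N.00 Example": {"Malay": 75.0, "Chinese": 25.05}}
gap = max_gap(shares, reference)
```

A check such as `gap <= 0.1` would then correspond to the stated tolerance under the percentage-point interpretation; if the margin is instead relative, the comparison would need to be adjusted accordingly.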