sla489 / parabix-devel / Merge requests / !6

Merged
Created Jul 08, 2024 by sla489 (@sla489, Maintainer)

Implement `simd_popcount` using NEON `cnt`.


The `cnt` instruction gives a per-byte population count for a vector, i.e. for `<8 x i8>` or `<16 x i8>`. This implementation counts the set bits in each byte of a vector, then uses NEON's `addp` (pairwise add) to reduce the per-byte counts to the requested field width and element count.
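As a rough illustration of the count-then-reduce idea, here is a minimal sketch using NEON intrinsics, with a portable fallback that models the same data flow on non-ARM hosts. It is not the code from this merge request: the helper name `popcount_2x64` is hypothetical, the field width is fixed at 64 bits, and the reduction uses the widening pairwise add (`vpaddlq_*`, i.e. `uaddlp`) instead of plain `addp` so each step doubles the field width directly.

```cpp
#include <cstdint>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

// Population count of a single byte (used only by the portable fallback).
static inline uint8_t byte_popcount(uint8_t x) {
    uint8_t n = 0;
    while (x) {
        x &= static_cast<uint8_t>(x - 1);  // clear the lowest set bit
        ++n;
    }
    return n;
}

// Hypothetical helper (name is illustrative, not from Parabix):
// popcount of each 64-bit field of a 128-bit value given as 16 bytes.
// out[0] covers in[0..7], out[1] covers in[8..15].
static inline void popcount_2x64(const uint8_t in[16], uint64_t out[2]) {
#if defined(__aarch64__)
    uint8x16_t v   = vld1q_u8(in);
    uint8x16_t c8  = vcntq_u8(v);       // cnt: 16 per-byte counts, each 0..8
    uint16x8_t c16 = vpaddlq_u8(c8);    // widening pairwise add: 8 x u16
    uint32x4_t c32 = vpaddlq_u16(c16);  // widening pairwise add: 4 x u32
    uint64x2_t c64 = vpaddlq_u32(c32);  // widening pairwise add: 2 x u64
    vst1q_u64(out, c64);
#else
    // Portable model of the same steps: per-byte counts, then pairwise sums.
    uint8_t c8[16];
    for (int i = 0; i < 16; ++i) c8[i] = byte_popcount(in[i]);

    uint16_t c16[8];
    for (int i = 0; i < 8; ++i)
        c16[i] = static_cast<uint16_t>(c8[2 * i] + c8[2 * i + 1]);

    uint32_t c32[4];
    for (int i = 0; i < 4; ++i)
        c32[i] = static_cast<uint32_t>(c16[2 * i]) + c16[2 * i + 1];

    for (int i = 0; i < 2; ++i)
        out[i] = static_cast<uint64_t>(c32[2 * i]) + c32[2 * i + 1];
#endif
}
```

Stopping the reduction one or two steps earlier would give 32-bit or 16-bit field counts; the sketch assumes a little-endian lane layout, as on typical AArch64 targets.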

The performance gain from this change is noticeable: the popcount kernel uses ~10% fewer cycles on average.

Source branch: popcount