cameron / parabix-devel / Issues / #15

Closed
Open
Created Feb 25, 2020 by cameron (@cameron, Maintainer)

Accelerate FileSelect processing

Often a command needs to be applied to a collection of files or directories. These may include files explicitly named on the command line or files implicitly required by a recursive directory traversal (typically requested with the "-r" command line flag). The problem of determining which files to process is called fileselect processing.

Fileselect processing may use patterns to identify files that should be included or excluded. These patterns may be specified on the command line or read from a .gitignore file.

lib/fileselect/fileselect.cpp contains an implementation that handles standard include and exclude patterns given on the command line or in .gitignore files. However, it is too slow when applied to large source code collections such as the Linux source code. The principal problem is the cost of compiling .gitignore files: there may be hundreds of these files in a large source code base. Rather than using Parabix regular expression processing (which has very fast matching at the expense of high compile time), investigate whether other regular expression (or git glob) libraries could be used to improve performance.
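As one possible direction, the sketch below matches file names against .gitignore-style lines with POSIX fnmatch, which interprets the glob directly and so has no per-file compile step. This is an illustration under stated assumptions, not the existing implementation: the names IgnoreRule, parseIgnoreText and isIgnored are hypothetical, patterns are tested against a bare file name only, and gitignore features such as "**", a leading or trailing "/", and escaped characters are not handled. The last matching rule decides, as in gitignore.

```cpp
#include <fnmatch.h>  // POSIX glob-style matching; patterns need no compilation
#include <sstream>
#include <string>
#include <vector>

// One rule per pattern line. `negated` marks a leading '!' (re-include).
struct IgnoreRule {
    std::string glob;
    bool negated;
};

// Parse the text of a .gitignore-like file, skipping blank lines and
// '#' comments. Simplified: no escape handling, no directory anchoring.
std::vector<IgnoreRule> parseIgnoreText(const std::string& text) {
    std::vector<IgnoreRule> rules;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (line.empty() || line[0] == '#') continue;
        bool negated = line[0] == '!';
        if (negated) line.erase(0, 1);
        if (!line.empty()) rules.push_back({line, negated});
    }
    return rules;
}

// The last rule that matches decides; a name matched by no rule is kept.
bool isIgnored(const std::vector<IgnoreRule>& rules, const std::string& fileName) {
    bool ignored = false;
    for (const IgnoreRule& r : rules) {
        if (fnmatch(r.glob.c_str(), fileName.c_str(), 0) == 0) ignored = !r.negated;
    }
    return ignored;
}
```

With the rules "*.o", "!keep.o" and "build-*", the names main.o and build-x86 are ignored while keep.o and main.cpp are kept. Whether this approach actually beats compiled Parabix matching on a tree the size of the Linux source would need to be measured; the per-name cost grows with the number of rules, while the setup cost per .gitignore file is just parsing.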
