Software Developer - Montreal Computational and Quantitative Linguistics Lab

Application Period: May 9, 2024 - May 17, 2024

We are no longer accepting applications for this position.

Contact: internshipofficer.cs@mcgill.ca


Research position for McGill students. To apply, send your application (CV and cover letter) to internshipofficer.cs@mcgill.ca by May 17, 2024, with the position title in the subject line.

Details and information

PolyglotDB is software for representing and querying speech databases, developed in the Montreal Computational and Quantitative Linguistics Lab. PolyglotDB combines Neo4j for representing the structure of time-stamped and annotated speech data, InfluxDB for representing time-series data (such as pitch tracks), SQL for relational metadata, and a custom meta-language for writing queries, which are parsed into the underlying query languages (Cypher, InfluxQL, SQL). PolyglotDB is used for research projects that involve processing large-scale speech data, primarily SPADE.

We are looking for a software developer for PolyglotDB, whose development has been frozen since 2020. You will need to do at least (1) technical maintenance: updating the system to work with current versions of its dependencies (especially Neo4j 4.0 -> 5.0 and OpenJDK 11 -> 17) on currently supported architectures (Mac/Windows/Linux), and streamlining installation; and ideally also (2) new development: implementing support for new architectures (e.g. newer Mac chips) and new data types, integrating with other speech analysis or statistical software (e.g. R, Praat), and implementing and documenting new use cases. Some critical work on (1) is needed to keep the software working for current users, but otherwise the job is flexible and can be tailored to your interests and to how much you want to focus on software development vs. learning about the application area (speech data science).

Required knowledge and skills: (Essential) experience with software development in Python and with databases (NoSQL, ideally Neo4j); (Useful) experience with test-driven development, Java, speech technology, or linguistics.

Start and end date of research project: As soon as possible - August 2024

Remuneration details: TBD