2026-0095 Big Data and AI Tech. for Searchable Archives (NS) NETHERLANDS - 8 Jul

Park Lane Recruitment · www.plr.ltd

Location The Hague, Netherlands
Type Full time
Level Mid
NATO
Deadline Date: Wednesday 08 July 2026
 
Requirement: Big Data and AI Technology for Raw Data to Searchable Archives - Data Processing Pipeline
 
Location: On-site at NATO Communications and Information Agency, The Hague, The Netherlands
 
Cost Not to Exceed: EUR 88,290
 
Period of Performance: 2026-27: 12 August 2026 - 28 February 2027
 
Required Security Clearance: NATO SECRET
 
Please do NOT apply for any NATO contract positions unless you meet ALL the following criteria:
  1. Current National or NATO SECRET clearance
  2. Nationality of one of the NATO member countries
  3. Current work visa for the specific location if applying for an in-country position
Any application that does NOT meet all of the above, or does not CLEARLY show this on the CV, will be deleted.
 
Introduction:
  • The NATO Communications and Information Agency (NCIA), located in The Hague, Netherlands, is currently processing vast amounts of highly varied data coming from theatre for the purpose of efficient archiving.
  • Within the NCIA Chief Technology Office (CTO), the Exploiting Data Science and Artificial Intelligence (EDS&AI) team is tasked with applying Big Data and AI technology to prepare, run and adjust pipelines that convert various source data into archiving formats and metadata, and prepare it for semantic search.
  • NATO has an obligation to support national investigations into situations that occurred in theatre. To support the different teams involved as effectively as possible, the EDS&AI team brings the expertise to extract and exploit the vast and varied data available, using the Agency's high-performance computing classified sandbox.
  • The EDS&AI team provides the core data science skills and technology needed for big data analysis and AI, and applies innovative technology to data whenever it is not possible to extract value with conventional approaches.
Objective
 
This Statement of Work describes the work necessary to provide specific AI and Data Exploitation activities for processing raw data from theatre to searchable archives. The services will be provided to the NCIA CTO/EDS&AI team, as they deliver specialised Data Science and AI results to their stakeholders in NATO Headquarters and NATO Allied Command Operations.
 
Overarching objectives:
  • Make required documents from theatre accessible and searchable by archivists during execution
  • Capture document contents into long term preservation formats
  • Capture Functional Area System (FAS; back-up) contents into long term preservation formats
  • Identify (and remove) duplicate documents, records of temporary value and non-records that are not required for archiving
  • Provide interim and final data reports describing actions and results
This task is structured as a deliverable-based engagement and not as level-of-effort support.
 
Scope of Work:
  • Under the direction of CTO-EDS&AI, the Contractor shall design, build, adapt, execute and maintain data processing pipelines within the NCIA classified sandbox environment.
  • Setting up and improving pipelines that process all required documents and that uniquely identify and trace decisions and processing steps. This work is to be conducted in the provided classified sandbox environment, using the provided performance hardware and toolsets.
  • Implementing and improving pipeline steps for marking duplicate files, based on file attributes, path structure and content similarity, and on rules that define when a file or structure counts as a duplicate.
  • Extracting document-format records from Functional Area System (FAS) databases and back-ups. Archiving SMEs and system SMEs are available for guidance on target formats, source system structure and data interpretation. Each FAS is processed separately.
  • Converting various office, image and video file types to accepted archiving formats and monitoring progress, including extracting metadata and preparing semantic search indexes.
  • Automating the registration of all processed documents with semantic indexes using the sandbox natural language search tool.
  • Automating the final copy of all non-duplicate and extracted archive documents with content and metadata to the NATO archiving system.
  • Reporting status, progress and statistics of the raw files being processed to archive formats, metadata and search indexes.
  • Delivering full reports of results, a trace of the pipeline steps taken and stakeholder-accepted failures, with quarterly updates.
  • In general, most items will translate to a build (new pipeline or processing step), execute (reported progress on data batches), improve (optimized or corrected pipeline or processing step) or monitor (check on logs and progress statistics) activity. Pipeline orchestration is expected to use KNIME. Reporting efforts are expected to target Microsoft Power BI dashboards. GitLab is expected to be used for source code management and documentation.
Coordination and Reporting:
  • The Contractor shall provide services on-site at NATO Communications and Information Agency in The Hague. The Contractor shall coordinate with, and report to, the NCIA EDS&AI team in The Hague.
  • The Contractor shall be given access to necessary NATO IT systems, and shall comply with all necessary policies and procedures.
  • The Contractor shall participate in regular status update meetings and other meetings, either in person at the office or remotely via conference call facilities, according to their manager's instructions.
  • For each Processing Unit to be considered complete and payable, the Contractor must report the outcome of their work, first verbally during the retrospective meeting and then in writing within three (3) days after production of the deliverable has ended. The written report shall take the form of a short email to the nominated point of contact at the NCI Agency, briefly describing the work performed and the development achievements.
  • Knowledge transfer activities may include provision of operational documentation, pipeline overview briefings and lessons learned. Detailed arrangements shall be agreed during execution.
Requirements
Specific Requirements:
  • At least 3 years of practical experience in the field of data science and/or data analytics.
  • Experience using data processing, visualization and analytics software packages and development environments, preferably including KNIME, VS Code, GitLab, Power BI, JupyterLab and Docker-based APIs.
  • Experience with Big Data processing, creating and utilizing containerized building blocks and running containers (APIs) on Kubernetes clusters.
  • Experience with programming and scripting in languages such as Python, R and SQL, and working with data formats including CSV, XML and JSON.
  • Experience performing content extraction from files, databases and systems, including LLM-based embedding models, entity extraction, keyword extraction and content similarity measures.
  • Creative, flexible and proactive in overcoming obstacles.
  • Good drafting, communication and presentation skills in English, at both technical and non-technical levels.
  • High attention to detail and accuracy.
Educational Qualifications:
  • Master's degree in Computer Science, Engineering or a relevant field.
  • A higher degree in Data Science is preferred.
