We’re seeing corrupted character encoding in candidate data exports from our talent management module. International candidate names with accented characters or non-Latin scripts are appearing as question marks or garbled text in our downstream recruiting analytics platform.
For example, a candidate named “José García” appears as “Jos? Garc?a” in the export file, and candidates with Chinese names show up as “???” entirely. This is affecting about 15% of our candidate records and making it impossible to properly track and report on our global recruiting pipeline.
Our export process uses SFTP to transfer CSV files nightly:
Export Format: CSV
Encoding: UTF-8 (configured)
SFTP Transfer: Automated nightly at 2 AM
Destination: recruiting-analytics.company.com
The Dayforce UI displays all candidate names correctly with proper character rendering, so the source data is intact. The corruption only appears in the exported files. We’ve verified that our receiving system supports UTF-8 encoding. What are we missing in the export configuration or SFTP transfer settings that’s causing this character encoding loss?
The transfer step can cause encoding issues if the transfer mode isn’t set correctly. SFTP normally moves files byte for byte, but some clients expose a text (ASCII) transfer mode. Make sure your SFTP client is using binary mode, because text mode can rewrite bytes in transit and corrupt multi-byte characters. Also check whether any character set translation settings are enabled on either the SFTP server or the client, since those could convert characters during transfer.
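One way to rule the transfer in or out is to compare a checksum of the file before it leaves the source with a checksum of the file that lands on the analytics host. This is only a sketch: the function names are made up for illustration, and it assumes you can get a copy of the file from both sides.

```python
import hashlib


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, reading it as raw bytes."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:  # "rb" so nothing is decoded or translated
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_bytes(source_path, received_path):
    """True if both files are byte-for-byte identical."""
    return sha256_of(source_path) == sha256_of(received_path)
```

If `same_bytes` returns True for the pre-transfer and post-transfer copies, the SFTP step is not altering the data and the problem is either upstream (the export itself) or downstream (whatever reads the file).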
Setting the export to UTF-8 isn’t enough on its own. You also need to check whether the exported files actually begin with a byte order mark (BOM), which for UTF-8 is the three bytes EF BB BF. Some systems expect UTF-8 with a BOM while others expect UTF-8 without one. If there’s a mismatch, the receiving system might misinterpret the encoding and display characters incorrectly. Check your Dayforce export configuration for a BOM setting option.
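If you want to see what the file actually contains rather than what the configuration says, a quick check of the first bytes will do it. A minimal sketch, with hypothetical function names:

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """True if the data starts with the UTF-8 BOM (bytes EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def decode_utf8_any(data: bytes) -> str:
    """Decode UTF-8 with or without a BOM; 'utf-8-sig' drops a leading BOM."""
    return data.decode("utf-8-sig")


def file_has_bom(path) -> bool:
    """Read only the first three bytes of a file and test them for the BOM."""
    with open(path, "rb") as f:
        return has_utf8_bom(f.read(3))
```

Reading with `utf-8-sig` is a reasonable default on the receiving side when you don’t control whether the exporter writes a BOM, since it accepts both variants.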
We encountered this exact issue and discovered that the problem was actually in the middleware layer between Dayforce and our analytics platform. Even though both endpoints supported UTF-8, our ETL tool was reading the file with a default Latin-1 encoding assumption. We had to explicitly configure the ETL input stage to interpret files as UTF-8. Check if there’s any middleware or ETL process that might be re-encoding the data after the SFTP transfer.
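To illustrate what an explicit encoding on the input stage changes, here is a small sketch in plain Python. It is not our ETL tool’s configuration, and `read_candidates` is a made-up name. It also shows that the two failure modes look different: decoding UTF-8 bytes as Latin-1 gives garbled text, while literal question marks are what you get when text is re-encoded into a character set that can’t represent it.

```python
import csv
import io


def read_candidates(raw: bytes, encoding: str = "utf-8"):
    """Decode raw CSV bytes with an explicit encoding and return the rows."""
    text = raw.decode(encoding)
    return list(csv.reader(io.StringIO(text)))


sample = "name\nJosé García\n".encode("utf-8")

# Decoded with the encoding the file was written in: names come back intact.
rows_ok = read_candidates(sample, "utf-8")

# Decoded with a Latin-1 assumption: each accented letter turns into two
# wrong characters (garbled text, not question marks).
rows_garbled = read_candidates(sample, "latin-1")

# Re-encoding to a charset that lacks the characters replaces them with "?".
lossy = "José García".encode("ascii", errors="replace")
print(lossy)  # b'Jos? Garc?a'
```

So if the file on disk already contains literal “?” bytes, some step converted the text to a narrower character set, and fixing the reader’s encoding alone won’t bring the original characters back.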