Comparing IoT data lake vs SAP HANA native storage for monetization analytics

We’re designing the storage architecture for our IoT monetization analytics platform and evaluating two approaches: storing raw device telemetry in a data lake (S3/Azure Data Lake) with periodic aggregation to HANA, versus using HANA native storage for everything.

Our scale: 50K connected devices generating billing events, roughly 2 TB of new data per month. We need real-time billing calculations as well as historical trend analysis going back 2+ years. The cost difference is significant: data lake storage runs roughly 1/10th the cost of HANA, but query performance and integration complexity favor HANA native.

Has anyone implemented a hybrid approach where hot data (last 3-6 months) lives in HANA for real-time monetization while cold data sits in the lake for analytics? Curious about the integration patterns, performance trade-offs, and whether the cost savings justify the architectural complexity.
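To make the split I have in mind concrete, here's a minimal routing sketch (names and the 90-day cutoff are hypothetical, just illustrating the idea):

```python
from datetime import date, timedelta

# Hypothetical cutoff: data newer than this lives in HANA, older in the lake.
HOT_WINDOW_DAYS = 90

def route_query(start: date, end: date, today: date) -> list[str]:
    """Decide which storage tiers a date-ranged query must touch."""
    cutoff = today - timedelta(days=HOT_WINDOW_DAYS)
    tiers = []
    if start < cutoff:
        tiers.append("lake")   # historical portion of the range
    if end >= cutoff:
        tiers.append("hana")   # hot portion of the range
    return tiers

# A 2-year trend query would hit both tiers; a current-month billing
# query would hit only HANA.
```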

Good point about schema consistency. How do you handle the historical analytics queries that span both hot and cold data? Do you query both systems and merge results, or do you replicate aggregated summaries back to the lake?
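For context, the merge I'd picture on our side looks something like this (a sketch, assuming both tiers return rows in the same `(month, device_id, amount)` shape and the tiering job deduplicates events so only a boundary month can appear in both):

```python
from collections import defaultdict

def merge_monthly_totals(hot_rows, cold_rows):
    """Merge per-month billing totals from HANA (hot) and the lake (cold).

    Rows are (month, device_id, amount) tuples. Keys that appear in both
    result sets are summed, which assumes the tiering job deduplicates
    events so a month is never double-counted across tiers.
    """
    totals = defaultdict(float)
    for month, device_id, amount in [*cold_rows, *hot_rows]:
        totals[(month, device_id)] += amount
    return dict(totals)
```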

We went full data lake initially and regretted it. Query performance for real-time billing was terrible: 5-8 second latencies on aggregation queries that HANA handles in milliseconds. We ended up moving the last 90 days into HANA and keeping older data in the lake. The integration overhead is real, though: you need solid ETL pipelines and careful partitioning strategies.
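By "careful partitioning" I mean Hive-style date partitions on the lake side so query engines can prune by day, plus a fixed eviction cutoff for the tiering job. A sketch (bucket name, table path, and helper names are hypothetical):

```python
from datetime import date, timedelta

def lake_partition_path(bucket: str, event_day: date) -> str:
    """Hive-style date partitioning so engines can prune by day."""
    return (f"s3://{bucket}/billing_events/"
            f"year={event_day:%Y}/month={event_day:%m}/day={event_day:%d}/")

def eviction_cutoff(today: date, hot_days: int = 90) -> date:
    """Everything on or before this day moves from HANA to the lake."""
    return today - timedelta(days=hot_days)
```

The nightly ETL job computes `eviction_cutoff`, exports the matching HANA partitions to the corresponding lake paths, verifies row counts, then drops them from HANA.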

We use federated queries via HANA Smart Data Access to query the lake directly from HANA. It’s not as fast as native HANA queries, but for historical analytics where sub-second response isn’t critical, it works well. The benefit is a single query interface - your analytics tools just hit HANA, and it transparently federates to the lake for older data. Performance is acceptable for most analytical workloads, though we do pre-aggregate common metrics monthly and store those summaries in HANA to avoid repeated federation overhead.
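The monthly pre-aggregation step is simple in principle; this is roughly the rollup logic, sketched in Python rather than our actual pipeline (field names here are illustrative, not our real schema):

```python
from collections import defaultdict

def preaggregate_monthly(events):
    """Roll raw billing events up to per-(month, device) summaries.

    events: iterable of dicts with 'ts' (ISO date string), 'device_id',
    and 'amount'. The summaries land in a HANA table so federated
    queries against the lake are only needed for ad-hoc drill-downs.
    """
    summary = defaultdict(lambda: {"event_count": 0, "total_amount": 0.0})
    for e in events:
        key = (e["ts"][:7], e["device_id"])  # 'YYYY-MM'
        summary[key]["event_count"] += 1
        summary[key]["total_amount"] += e["amount"]
    return dict(summary)
```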