Description
Bringing AI inference closer to the data source offers significant advantages in cost, privacy, and performance. Recent advances in lightweight GenAI models (e.g., 1-8B parameters) provide a disruptive opportunity to shift GenAI deployment from the cloud to the edge, provided those alternatives to cloud-based GenAI are practical and efficient. This white paper outlines a strategic approach to moving GenAI deployments from cloud-native (i.e., GPU-based) solutions to edge solutions that use the built-in compute acceleration of CPUs, GPUs, and NPUs (e.g., Intel® Core™ Ultra processors, Intel® Arc™ GPUs) together with open-source GenAI models. On-device deployment offers low total cost of ownership (TCO), offline capability, data sovereignty, and reduced latency, making powerful GenAI models accessible across regions and sectors that may previously have faced barriers to deployment.