Language changes:
- Added support for non-type template parameters. Uniform integers
and enums can now be used as template parameters.
- Added dot product functions for unsigned and signed int8 and int16
types. They leverage AVX-VNNI and AVX512-VNNI instructions when the
target supports them.
- Added macro definitions for numeric limits.
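
For reference, the per-lane operation behind the int8 dot product
functions above can be sketched in scalar C++. This is only an
illustration of the arithmetic that VNNI-style instructions perform
(four 8-bit products summed into a 32-bit accumulator). The helper
name dot4_u8i8_accumulate is hypothetical and is not the ISPC
standard library name or signature, which this message does not list.

```cpp
#include <cstdint>

// Scalar reference for a 4-way unsigned-by-signed 8-bit dot product
// with accumulation. Hypothetical helper for illustration only; it is
// not part of the ISPC standard library.
inline int32_t dot4_u8i8_accumulate(const uint8_t a[4], const int8_t b[4],
                                    int32_t acc) {
    for (int i = 0; i < 4; ++i) {
        // Widen both operands to 32 bits before multiplying so the
        // products cannot overflow the 8-bit types.
        acc += static_cast<int32_t>(a[i]) * static_cast<int32_t>(b[i]);
    }
    return acc;
}
```

On a target without VNNI, a compiler has to emulate this with
separate widening, multiply and add steps, which is why dedicated
targets are listed below.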
New targets:
- avx2vnni-i32x4, avx2vnni-i32x8, avx2vnni-i32x16 with AVX-VNNI
instruction support,
- avx512icl-x4, avx512icl-x8, avx512icl-x16, avx512icl-x32 and
avx512icl-x64 with AVX512-VNNI instruction support.
Code generation:
- Fixed code generation for GPU where unnecessary vectorized
instructions were used during address arithmetic, e.g., when
accessing fields of varying structures (#2846).
- Improved generated code for cases when the foreach loop iteration
domain is smaller than the target width (#2836).
Compiler switches behavior:
- The --pic command line flag now corresponds to the -fpic flag of
Clang and GCC, whereas the newly introduced --PIC corresponds to
-fPIC.
Bug fixes:
- The implementation of the round standard library function was
aligned across all targets. This may affect the results of code that
uses this function on the following targets: avx2-i16x16, avx2-i8x32
and all avx512 targets (#2793).
- Fixed cases when unwind info was not generated for functions.
This impacted debugging and profiling on Windows (#2842).
- Fixed broken targets sse4-i8xN and avx2-i8xN (#2800).
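
As an illustration of why aligning a round implementation can change
results, the C++ sketch below contrasts two common tie-breaking
conventions. This message does not state which convention ISPC used
before or after the change, so the sketch only shows how two
implementations of rounding can disagree on halfway values. The
function names are hypothetical.

```cpp
#include <cmath>

// Ties round away from zero (the behavior of C's round()).
inline double round_half_away(double x) { return std::round(x); }

// Ties round to the nearest even integer (the behavior of nearbyint()
// under the default FE_TONEAREST rounding mode, which is assumed here).
inline double round_half_even(double x) { return std::nearbyint(x); }
```

The two helpers agree on most inputs and differ only on exact halfway
values such as 2.5, so code that depends on tie handling is the code
most worth re-checking after the upgrade.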
More details:
https://github.com/ispc/ispc/releases/tag/v1.24.0
Signed-off-by: Anuj Mittal <anuj.mittal@intel.com>