Summary: 6 instances, 4 unique

| Text | Count |
| --- | --- |
| // FIXME: 2-way bank conflict | 1 |
| // allocate memory for all features (FIXME: 4 GB barrier on some devices, need to split to multiple buffers) | 1 |
| // TODO: try to avoid bank conflict here | 1 |
| // FIXME: how to do the above on AMD GPUs?? | 3 |