1. The long-term solution is to support the `SparseTensor` feature (inherited from `MetaTensor`), but the landing period will be longer.
2. The quick support solution is to express this sparse structure with `DataClass`, and `Class` will be convert to `tuple` in `clean.cc`. The implement can reference `SequenceIterator`, and use `MultiType` to add a overload for add op and gradient update, if necessary.
3.`GatherV2` can use the `ScatterNDUpdate` to update the gradient.