Source/CNTKv2LibraryDll/proto/onnx/CNTKToONNX.cpp (55 lines):
- line 1237: // TODO: is there a thing as nested loop?
- line 1372: // TODO: specific to LSTM. icfo (CNTK) to iofc(ONNX)
- line 1444: // TODO: specific to LSTM. icfo (CNTK) to iofc(ONNX)
- line 1543: // TODO: verify that srcTensors has consistant shapes
- line 1847: // TODO: ONNX data types other than float and double are
- line 2054: // TODO: for now all param cases are for free dimension
- line 2649: // TODO: sanity check for all variables to have the same shape and data types.
- line 2685: // TODO: sanity check for all variables to have the same shape and data types.
- line 2785: // TODO:
- line 2863: // TODO: following commented out attributes are not supported. Use default.
- line 2869: // TODO: implement peephole
- line 2890: // TODO: enable sequence_lens. It requires additional model input of batched sequence data layout.
- line 2955: // TODO: Except X, all other inputs to LSTM are treated as constant.
- line 2968: // TODO: make bidirectional LSTM work by figuring out output data
- line 2986: // TODO: sanity check for all variables to have the same shape and data types.
- line 3052: // TODO: sanity check for all variables to have the same shape and data types.
- line 3099: // TODO:
- line 3215: // TODO: to be consistant with RNN and LSTM where Yhs is the only output.
- line 3222: // // TODO: batchSize is fixed to one. Needs to find out how to handle bacth axis as a free dimension.
- line 3232: // TODO: Except X, all other inputs to GRU are treated as constant.
- line 3245: // TODO: make bidirectional GRU work by figuring out output data
- line 3250: // TODO: uncomment this code once LotusRT output shape matches ONNX
- line 3261: // TODO: sanity check for all variables to have the same shape and data types.
- line 3312: // TODO:
- line 3433: //// TODO: make bidirectional RNN work by figuring out output data
- line 3438: //// TODO: uncomment this code once LotusRT output shape matches ONNX
- line 3518: // TODO: add a test case for this code path.
- line 3902: // TODO: extend this method to handle bidirection LSTMs.
- line 4237: // TODO: how to handle batch size not being one?
- line 4443: // TODO: cannot call ProcessOutputs because we want the final output to have the expected ArgNode name
- line 4456: // TODO: how to handle cases where batch_size is not 1?
- line 4586: // TODO: implement Where op.
- line 4716: // TODO: Waiting Skype smart reply with attention model before enabling the functionality of tracking sequence dimension.
- line 4736: // TODO: Waiting Skype smart reply with attention model
- line 4748: // TODO: Waiting Skype smart reply with attention model
- line 5272: // TODO: UniqueNodeNameStorage is causing model validation failure.
- line 5318: // TODO: this shall be handled internal to UniqueNodeNameStorage
- line 5405: // TODO: uncomment this code once bidirectional LSTM is supprted.
- line 5659: // TODO: make ONNX MeanVarianceNormalization and CNTK test work with sequential models.
- line 5731: // TODO: do we need to get blockroot if it is a block function?
- line 5832: // TODO: ToBatchAxis also override batch size.
- line 5996: //// TODO: to skip a batch/sequence pack/uppack, we need
- line 6119: // TODO: verify - ONNX specifies that ImageScaler always need a batch axis
- line 6157: // TODO: if it is an identity op, we shall peek its input node to find the correct tensor element type.
- line 6349: // TODO: handle all cases
- line 6393: // TODO: investigate whether this is really needed.
- line 6754: // TODO: handle hasSequenceAxis cases
- line 6781: // TODO: more test is needed to cover general cases where batch and sequence axis are involved
- line 7051: // TODO : crop_automatic
- line 7074: // TODO: Try removing this branch. May not be needed after batch dimension fix.
- line 7352: // TODO: use ONNX Expand once ONNX version 7 is supported
- line 8915: // TODO: bias is at src->Inputs()[1] (inputs[1]) for New Conv case.
- line 9171: MapAndUpdateONNXType(onnxOpName, true, inputIndex, input.GetDataType(), &inputArgType); // TODO: Is this needed? Probably not.
- line 9229: // TODO: Below line assumes that the first output of the op (e.g. conv)
- line 9254: MapAndUpdateONNXType(onnxOpName, false, 0, output.GetDataType(), &outputArgType); // TODO: Is this needed? Probably not.

Source/SGDLib/SGD.cpp (41 lines):
- line 74: // TODO: BUGBUG: if not starting from checkpoint, need to synchronize initial model
- line 289: net->AllocateAllMatrices(evaluationNodes, additionalNodesToEvaluate, criterionNodes[0]); // TODO: use criterionNodes.front() throughout
- line 292: // TODO: instead, remember the nodes directly, to be able to handle both float and double nodes; current version will crash for mixed networks
- line 294: // TODO: ^^ change to shared_ptr or unique_ptr
- line 318: // TODO: Should this be done in SGD::Adapt()?
- line 319: // TODO: Redo this leveraging that we now have shared_ptrs. It is probably even OK if both networks share feature nodes.
- line 320: // TODO: Then we can also share the MBLayout; which currently is copied by value.
- line 330: // TODO: After the change to shared_ptrs, this may no longer be necessary.
- line 342: // only one criterion so far TODO: support multiple ones?
- line 497: // TODO this assumes training is picked up with nodes with zero parameters
- line 541: for (int i = startEpoch; i < (int) m_maxEpochs; i++) // TODO: why is this an int, and not a size_t?
- line 710: // TODO: This was only printed if >1 eval criterion. Why? Needed?
- line 1017: StreamMinibatchInputs* inputMatrices, // TODO: why is this a pointer?
- line 1105: // TODO: move the two-forward-pass support out of the reader, make a first-class citizen.
- line 1201: // TODO: is it guaranteed that the GPU is already completed at this point, is it safe to overwrite the buffers?
- line 1234: // TODO: move this to that function as well--just tired to pass everything as arguments
- line 1235: // TODO: We should do this right after the GetMinibatch() call, since that's where these changed.
- line 1237: // TODO: original code did not call this for actualMBSize == 0
- line 1248: // TODO: currently we only support one node for regularization
- line 1252: refNet->GetMBLayoutPtrOfNetwork()->CopyFrom(net->GetMBLayoutPtrOfNetwork()); // TODO: This is UNTESTED (before this was missing, seemingly inconsistently)
- line 1346: Matrix* currParamsGradient = &(node->Gradient()); // TODO: we can use shared_ptrs now
- line 1375: m_gradHeader->numEvalNode = evaluationNodes.size(); // TODO: rename numEvalNode (plural)
- line 1449: // TODO: Check why l2Factor is not applied to L1. Bug?
- line 1535: let trainLossSinceLastLogged = epochCriterionSinceLastLogged.Average(); // TODO: Check whether old trainSamplesSinceLastLogged matches this ^^ difference
- line 1569: fprintf(stderr, (", %2." + to_string(mbProgNumPrecision) + "f%%").c_str(), mbProg * 100); // --TODO: use a * format?
- line 1674: // TODO: move the two-forward-pass support out of the reader.
- line 1724: // TODO: merge with training criteria
- line 1833: // TODO: move these into GetMinibatchIntoNetwork() --but those are passed around; necessary? Can't we get them from 'net'?
- line 1879: double learnRatePerSample = 1.0f / 8.0f / 0.618f / sqrt((double) m_mbSize[epochNumber]); // TODO: comment on these magic constants
- line 2207: // TODO: if this is too sensitive, we can add a margin on the bases of percentage of
- line 2286: // TODO: move the two-forward-pass support out of the reader.
- line 2296: // TODO: use GetMinibatchIntoNetwork().
- line 2392: return; // no need to do anything if already initialized. TODO: make it singleton
- line 2525: //By defualt, V1 uses UnitGain momentum. TODO: Do we need to enable V1 with non unit gain update?
- line 2841: template // TODO: needed?
- line 2886: vector errMsgs; // TODO: These are created but actually not returned, only their count is checked.
- line 2920: // TODO: why is this value not used?
- line 2935: double mbEvalCriPos = criterionNodes[npos2]->Get00Element(); // TODO: make Get00Element() a function of ComputationNodeBase
- line 3073: // TODO: mbSize and truncated should be specified differently for truncated BPTT:
- line 3162: m_gradType.targetAdagradAvDenom = configSGD(L"fsAdagradTargetAvDenom", 1.0); // TODO: deprecated parameter kept for back compat (set to 0.0025 inconjunction with reenabling the static bug)
- line 3189: m_gradientCheckSigDigit = configSGD(L"sigFigs", 6.0); // TODO: why is this a double?

Source/Readers/LMSequenceReader/SequenceReader.cpp (30 lines):
- line 86: bool SequenceReader::CheckIdFromLabel(const std::string& labelValue, const LabelInfo& labelInfo, unsigned/*TODO: LabelIdType?*/& labelId)
- line 203: // TODO: should ignore , check the sentence ending is
- line 216: RuntimeError("Input label expected to be a category label"); // TODO: ensure this at config time (maybe keep an assert() here as a reminder to the reader)
- line 250: // TODO: Why is this produced by the reader, and not just realized through the use of delay nodes in the network?
- line 318: // TODO: comment what this does, in a function called UpdateDataVariables
- line 333: // TODO: move this to class File as well
- line 400: // TODO: This function seems to be never called. Remove it if that is the case.
- line 443: // TODO: use a reference for m_labelInfo[index]
- line 550: // TODO: how many of these do we have? labelInfoIn, Min, Out, Max, and there must be exactly 2?
- line 679: // TODO: need to go down to all levels, maybe search for sectionType
- line 716: // TODO: we would need to add a sequenceMap type here as well
- line 788: m_parser.SetFilePosition(0); // TODO: can this ever be set to not 0?
- line 1065: // TODO: Document what this is. It seems we can fill specific hard-coded inputs with something interesting.
- line 1223: const size_t jRand = jSample; // TODO: This seems unfinished.
- line 1282: // TODO: What is this? Debug code?
- line 1391: if (features.size() > 1) // TODO: If this ever fails, please remove this check. One sample had 2 sections, but I could not see where they were used; this check is to verify that.
- line 1397: // TODO: Is it at all meaningful to allow no features section?
- line 1411: // It is possible to specify labelType = "none" for either. --TODO: I only tested doing so for the first.
- line 1526: if (word4idx.size() != nwords) // TODO: Why not infer it at this point in time? If labelInfo.dim == 0 then set if to word4idx.size()
- line 1547: // TODO: Clean this up--do this only if numIds is 0 (no class or mapping read), otherwise require them to be identical or 0.
- line 1575: // TODO: ^^ This should depend on the sequences themselves.
- line 1625: m_labelsIdBuffer = new /*typename IDataReader::*/LabelIdType[mbSize](); // TODO: no "new" please! Use a vector
- line 1794: #ifdef _MSC_VER // make some old configurations reproducable (m_cacheBlockSize used to be a constant) --TODO: remove in a few months
- line 1797: srand(++m_randomSeed); // TODO: older code did not have that; so no idea what random seed was used
- line 1856: pos++; // consume it --TODO: value is not used after this
- line 1895: // TODO: Why not fail here?
- line 1899: // TODO: validate whether the passed matrices set matches reader section definitions
- line 2003: Matrix& nbs = *matrices[L"numberobs"]; // TODO: what is this? We fall back to a different node?
- line 2042: // m_pMBLayout->SetWithoutOr(uttPos, timePos, MinibatchPackingFlags::SequenceStart); // TODO: can we use Set() (with OR)?
- line 2049: // TODO: this should have been renamed to CopyMBLayoutTo(), but it had the wrong signature??

Source/ComputationNetworkLib/TrainingNodes.h (30 lines):
- line 166: InputRef(1).InvalidateMissingGradientColumns(fr); // TODO: This should not be necessary.
- line 289: /*TODO: merge with call site*/ void BackpropToLeft(const Matrix& logOfRight, Matrix inputGradientValues,
- line 295: /*TODO: merge with call site*/ void BackpropToRight(Matrix& leftDivRight,
- line 378: // TODO: share most code with MatrixL2RegNode
- line 407: /*TODO: merge with call site*/ void BackpropToS(Matrix& gradientOfL1Norm,
- line 925: // TODO: share most code with MatrixL1RegNode
- line 949: /*TODO: merge with call site*/ void BackpropToS(Matrix inputGradientValues, const Matrix& gradientValues, const Matrix& inputFunctionValues, const Matrix& functionValues)
- line 951: ElemType v = gradientValues.Get00Element() / (functionValues.Get00Element() + EPS_IN_INVERSE); // TODO: GPU inefficiency
- line 1022: // ^^ TODO: we can merge these two
- line 1048: } // TODO: really? Return a reference to a local? TODO: change to const? and call it GetEvalMode()
- line 1078: // TODO (this does not really break it since for full matrices, class Matrix will resize by itself)
- line 1105: MaskMissingColumnsToZero(m_logSoftmax, InputRef(1).GetMBLayout(), fr); // TODO: is this the right way to neutralize gaps?
- line 1110: // TODO: are we treating gaps correctly here?
- line 1115: // TODO: are we treating gaps correctly here?
- line 1603: // TODO: Resize temp matrices here (not doing so does not really fail since for full matrices, class Matrix will resize by itself)
- line 1646: Matrix softMax_t = m_softMax.ColumnSlice(sz, nbr_wrd); // TODO: declare these outside of the loop to avoid the malloc
- line 1652: // TODO: can we use 'true' here instead? Above transposition hack won't work with row slices. 'obs' not used elsewhere
- line 1728: // - transition scores: square transition matrix, --TODO: log?
- line 1780: Matrix funcVal = Value(); // TODO: This just creates a 1x1 matrix set to 0.
- line 1841: /*TODO: merge with call site*/ void ForwardPropS(Matrix postprob, Matrix alpha, Matrix beta, Matrix& functionValues, const Matrix& lbls, const Matrix& pos_scores, const Matrix& pair_scores, int& firstLbl, int& lastLbl, const int iStep = 1)
- line 2015: Matrix mAlpha; // TODO: m_Alpha etc.
- line 2054: m_temp->AssignDifferenceOf(InputRef(0).ValueFor(fr), *m_classZeroLabels); // TODO: need a slice for m_classZeroLabels?
- line 2058: m_temp->AssignElementProductOf(*m_temp, InputRef(2).ValueFor(fr)); // TODO: is Input(2) minibatch data? Confirm
- line 2061: m_temp->AssignElementDivisionOf(*m_temp, *m_result); // TODO: this is in-place--does this function allow that?
- line 2099: // TODO: verify that all these operations on m_result really can do in-place (or use different methods instead)
- line 2378: // TODO: This must be configured in a generic fashion where tensor axes are chosen along which parameters are tied.
- line 2400: // TODO: Change all of these throughout the codebase to 'class enum'. Also change all places where we still use integer constants.
- line 2742: // TODO: Move this out. Follow the same pattern as the RNN node. But can't without requiring another buffer.
- line 2744: sliceInputGrad, // (out) gradient for data input goes here --TODO: Check if cudnn engine adds the gradient, or just overwrites (BUGBUG). CNTK engine is OK.
- line 3009: // TODO: This should not be a config option, but rather inferred from dimensions of the Parameters.

Source/CNTK/BrainScript/BrainScriptEvaluator.cpp (28 lines):
- line 116: let val = dynamic_cast(value.get()); // TODO: factor out this expression
- line 301: let firstIsDouble = firstVal.Is(); // TODO: make this a std::array?
- line 320: // TODO: test this
- line 335: if (firstIsDouble) // Double/x --> only allow 1/x, implement as Reciprocal --TODO: fix once we have ElementDivideBy
- line 356: else if (e->op == L"**") InvalidInfixOpTypes(e); // TODO: implement this
- line 369: takesBool = true; operationName = L"ElementTimes"; // implemented as element product --TODO: needs a C++ node
- line 375: takesBool = true; InvalidInfixOpTypes(e); // TODO: implement this, needs a C++ node
- line 475: // TODO: implement this; also as basis for overriding parameters from the cmd line
- line 483: exprPath; // TODO: create a composite dictionary
- line 553: // TODO: This implementation takes a lot of stack space. Should break into many sub-functions.
- line 613: // TODO: document namedArgs--does it have a parent scope? Or is it just a dictionary? Should we just use a shared_ptr> instead for clarity?
- line 636: auto argVal = move(args[i]); // value of the parameter --TODO: Is this ever unresolved?
- line 647: argScope->Add(id, failfn, move(argVal)); // TODO: is the failfn the right one?
- line 716: // TODO: no scope here? ^^ Where does the scope come in? Maybe not needed since all values are already resolved? Document this!
- line 799: // TODO: where does the current scope come in? Aren't we looking up in namedArgs directly?
- line 801: // TODO: change this ^^ to the const & version of Apply() once it is there
- line 821: // note on exprPath: since - has only one argument, we do not include it in the expressionPath --TODO: comment correct?
- line 838: // note on exprPath: since ! has only one argument, we do not include it in the expressionPath --TODO: comment correct?
- line 856: #else // This does not actually work. --TODO: find out why
- line 925: // TODO: RegexReplace()
- line 936: // TODO: RegexReplace!
- line 960: // TODO: change to taking a regular format string and a :: array of args that are checked. Support d,e,f,g,x,c,s (s also for ToString()).
- line 961: // TODO: :: array. Check if that is the right operator for e.g. Haskell.
- line 962: // TODO: turn Print into PrintF; e.g. PrintF provides 'format' arg. Printf('solution to %s is %d', 'question' :: 42)
- line 980: else if (arg.Is()) // TODO: should have its own ToString() method
- line 983: let memberIds = record->GetMemberIds(); // TODO: test this after change to ids
- line 998: else if (arg.Is()) // TODO: should have its own ToString() method
- line 1193: // main TODO items:

Source/ComputationNetworkLib/ComputationNode.h (25 lines):
- line 70: // TODO: Make this a trace option, e.g. enabled by the ComputeEnvironment.
- line 96: // TODO: OperationName calls static TypeName which does not match the actual type names in that the 'Node' is missing.
- line 228: protected: // TODO: should be fully encapsulated here
- line 299: public /*protected*/ ComputationNetworkOwnedNodeState, // TODO: figure the 'protected' business out, somehow the 'friend' thing does not work
- line 321: // TODO: should m_learningRateMultiplier be set to 0? Or should every node have a way to add its own say on the learning rate for all its inputs?
- line 505: if (m_sampleLayout.GetRank() < 1 || ((m_sampleLayout.GetRank() > 2) && notFlattenableTo2D)) // note: scalars are not stored as tensors of rank 0, but rather as 1-dim vectors. TODO: clean this up some day
- line 590: if (!m_pMBLayout) // TODO: temporary workaround to Check_t() calls which call this. TODO: Delete the first arg from Check_t() after memshare merge.
- line 678: // TODO: two sets of functions, choose one
- line 857: // TODO: This should be a method of ComputationNetwork, not ComputationNode.
- line 935: // TODO: This should be a method of ComputationNetwork, not ComputationNode.
- line 980: float m_learningRateMultiplier; // update parameters? Only used for LearnableParameters. --TODO: Should we make this a member of LearnableParameters actually? And require a type cast? Currently it is read out for all leaves.
- line 1289: auto p = dynamic_cast*>((ComputationNodeBase*)vp); // TODO: check that all void* casts really come from ComputationNodeBasePtr; or add a method ToVoidPtr(). Or get rid of the void*?!
- line 1579: // TODO: This is only used for testing whether a gradient has been allocated. Maybe reduce to bool HasGradient()?
- line 1691: // TODO: Are all these meant to read out a scalar? Then rename and verify dimensions.
- line 1714: // TODO: TensorShape should have a method to
- line 1794: // TODO: move to -Base (or -Network?)
- line 1836: // TODO: customize this function for all nodes that uses temp internal matrices.
- line 2093: // TODO: similar to DumpInfo; used by ExperimentalNetworkBuilder test implementation
- line 2231: WriteFormattingOptions() : // TODO: replace by initializers?
- line 2251: // TODO: Most of these are reduce nodes that output a single number, no MBLayout. Maybe abstract those out further
- line 2300: // TODO: There are too many of these. This indicates improper class hierarchies.
- line 2337: // TODO: This is a bit indirect. Can it be done more nicely?
- line 2355: // TODO: We can use this interface in more places.
- line 2379: #define UsingComputationNodeMembers /*without OperationName; needed to support inconsistent pattern of InputValue--TODO: This comment it out of date. */ \
- line 2566: // TODO: This is a stopgap. Is this the right thing to do? It changes the matrix type in-place.

Source/CNTKv2LibraryDll/API/CNTKLibrary.h (25 lines):
- line 92: /* TODO:
- line 186: // TODO: FPGA
- line 875: // TODO: The set methods should be offered in template from
- line 1244: //TODO: Do we assume there is only one batch axis in the whole system?
- line 1387: /// TODO: We need to have native support for DictionaryValue and DictionaryValue.
- line 2081: // TODO: This should be a private but if not made public, the python bindings build complains about an unresolved external
- line 2109: // TODO: a better way to get hash value?
- line 2124: // TODO: Variable equality should be based on uids.
- line 2287: // TODO: Constructor to move a specified NDArrayView value
- line 2372: // TODO: Constructor to move a specified NDArrayView value
- line 2465: // TODO: Variable hash should be based on uid.
- line 3678: // TODO: Do we need an Unregister to unload the module?
- line 4288: /// TODO: Specify the constraints on the shapes of the operands.
- line 4289: /// TODO: Document inferInputRankToMap
- line 4303: /// TODO: Specify the constraints on the shapes of the operands.
- line 4304: /// TODO: Document inferInputRankToMap
- line 4313: /// TODO: Specify the constraints on the shapes of the operands.
- line 4314: /// TODO: Document inferInputRankToMap
- line 4324: /// TODO: Specify the constraints on the shapes of the operands.
- line 4331: /// TODO: Specify the constraints on the shapes of the operands.
- line 4712: /// TODO:
- line 5031: //TODO: enable reference mb size for each rate
- line 5082: //TODO: replace the struct option with dictOptions
- line 6228: // TODO: Currently this is a workaround to free static MPIWrapper, it will go away soon.
- line 6515: // TODO: Encapsulate (freq, firstToWrite) as an update schedule type.

Source/Math/Matrix.cpp (23 lines):
- line 167: // TODO: Reformat DISPATCH... macros to the following form:
- line 200: m_devicesTransferedTo[0] = other.m_devicesTransferedTo[0]; // TODO: spelling
- line 261: // TODO: Why do we need these typecasts? (without it will fail with "cannot access private member declared in class 'Microsoft::MSR::CNTK::CPUMatrix'")
- line 874: // TODO: Implement optimized diagonal functions for sparse matrices. For now use the DiagonalToDense instead.
- line 1022: // TODO: Can we remove this, and have users use SetValue() instead? To avoid this potential error?
- line 1069: DecideAndMoveToRightDevice(*this, idx, a); // TODO: only move target if beta != 0
- line 1079: // TODO replace by more performant version directly on GPU that does not require the round-trip over CPU.
- line 1103: DecideAndMoveToRightDevice(*this, idx, a); // TODO: only move target if beta != 0
- line 1113: // TODO replace by more performant version directly on GPU that does not require the round-trip over CPU.
- line 1273: // TODO: do we need all these 'this->'?
- line 1930: // TODO: should this function test whether the size is changing, and skip if it isn't? We have at least one explicit test for this code calling this (recurrent node)
- line 2385: // TODO: We need ternary ops where the output storage is separate.
- line 4167: // TODO: Comment why we need a second ElemType.
- line 4168: // TODO: Move the shared core functions to the front of this source file.
- line 4175: // TODO: This is called somewhat inconsistently, sometimes with a=*this, sometimes with b=*this.
- line 4225: else if (deviceIdB == deviceIdC && deviceIdB != CPUDEVICE) // TODO: why not the other two combinations?
- line 4407: ElemType* arr = m_GPUMatrix->CopyToArray(); // TODO: unnecessary allocation/copy; why not make this a vector that we move over as an rvalue ref?
- line 5425: // TODO: two lines above should be changed as follows:
- line 5444: // TODO: Handle these cases:
- line 5684: // TODO: consider swapping the arguments in this case
- line 5892: // TODO: these are scalar operations--why are they in Matrix?
- line 5909: // TODO: use static LogAdd() as defined in TensorOps.h
- line 5920: y = temp; // TODO: ::swap(x,y)?

bindings/python/cntk/layers/layers.py (23 lines):
- line 287: reduction_rank=1, # (0 means input has no depth dimension, e.g. audio signal or B&W image) --TODO: call it item_rank?
- line 389: # So we emulate those dimensions on this level. TODO: Once this is suppored by the C++ code, remove the emulation here.
- line 410: # TODO: Test whether this is needed. We should instead just take whatever reduction dimension is given here as that of the input.
- line 423: # TODO: Should we cater to the special case of 1D convolution for text? I.e. sequential only (filter_shape=()).
- line 451: # TODO: We still have those axes in the kernel. Solve this once the C++ implementation supports 1D directly.
- line 471: # TODO: sharing = false? I'd need that for speech feature extraction.
- line 472: # TODO: should we allow to pass fixed weights instead? Like for Embedding? E.g. audio filters
- line 473: # TODO: this is not a convolution but a correlation, and W's shape has input and output depth reverted.
- line 475: # TODO: conflict of parameter order: filter_shape or num_filters first?
- line 478: # TODO: add a test case for passing a numpy array as initial values
- line 489: reduction_rank=1, # (0 means input has no depth dimension, e.g. audio signal or B&W image) --TODO: call it item_rank?
- line 613: # So we emulate those dimensions on this level. TODO: Once this is suppored by the C++ code, remove the emulation here.
- line 634: # TODO: Test whether this is needed. We should instead just take whatever reduction dimension is given here as that of the input.
- line 646: # TODO: Should we cater to the special case of 1D convolution for text? I.e. sequential only (filter_shape=()).
- line 663: # TODO: if reduction_rank==0 and sequential, we don't need the fake reduction axis, just use the sequential axis instead
- line 683: # TODO: We still have those axes in the kernel. Solve this once the C++ implementation supports 1D directly.
- line 698: # TODO: make sure the xD versions have all the needed parameters
- line 869: # TODO: need to merge with above. Can it simply be transpose=True?
- line 1105: # TODO: add sequential mode like Convolution()
- line 1296: # TODO: merge this. Test: Tests\EndToEndTests\CNTKv2Python\Examples\deconv_MNIST_test.py, Tests\EndToEndTests\Examples\Image\GettingStarted\07_Deconvolution
- line 1311: # TODO: should the rate(s) be default_options?
- line 1393: # TODO: map_rank is broken. We should specify the #slowest-changing axes. E.g. 1 would work for images and vectors. Requires C++ change.
- line 1502: scale = Parameter(_INFERRED, init=initial_scale, name='scale') # TODO: if this gets usage then offer a Softplus version like Stabilizer() for stability?

Source/ComputationNetworkLib/ComputationNetwork.h (22 lines):
- line 111: // TODO: modify file format to know this; then eliminate the dependency (and in some future, allow nodes to be different)
- line 230: ResetEvalTimeStamps(); // invalidate all m_value fields --TODO: redundant (called over again for every root node). Make this private and only call for sets of nodes.
- line 348: // This is meant to be used by FormRecurrentLoops(). TODO: Hopefully this can be not done anymore some day.
- line 361: // TODO: This is currently not immutable because it gets patched w.r.t. recurrent loops. Ideally we don't patch. Need to review and verify that it is sufficient.
- line 403: const auto& featureNodes = FeatureNodes(); // TODO: a getter; should be called GetFeatureNodes()
- line 422: // TODO: Instead of passing numAllSamples in here, we should determine it from the inputs in case of no layout. Or simply forbid this case.
- line 429: return numAllSamples; // TODO: Return the actual number of samples, by inquiring our own input nodes; then eliminate the numAllSamples parameter.
- line 537: // TODO: Why are all these static, but then take a network as the first argument? --> make them class members
- line 550: const double& hsmoothingWeight, // TODO: Why are all these passed by reference?
- line 664: for (const auto& groupNode : nodeGroup) // TODO: is there an STL algorithm?
- line 727: // TODO: there should be a map from output nodes to inputs, so that this operation doesn't take square time
- line 753: // TODO: This distinction should not be necessary anymore. Calling GetEvalOrder(nullptr) will have the same effect.
- line 766: for (const auto& node : GetEvalOrder(rootNode)) // TODO: verify that no use of this requires the actual eval order, then change to GetAllNodesForRoot()
- line 863: // TODO: can we just define it as private without implementation?
- line 871: // TODO: can we just define it as private without implementation?
- line 880: // TODO: move these to ComputationNetworkBuilder.cpp
- line 893: // TODO: not very nice--need to fix way more outside to get this right
- line 931: // TODO: We should verify that indeed this node is not referenced by other nodes or node groups,
- line 1153: ComputationNodeBasePtr m_sourceNode; // one of the nodes of the loop --TODO: What is the special meaning of this node? It seems to always be a delay node.
- line 1232: DEVICEID_TYPE m_deviceId; // TODO: is this shared by all nodes?
- line 1240: // TODO: Are these meant to be disjoint?
- line 1286: // TODO: does this apply to anything else besides temporary node-internal intermediate results? What, for example?

Source/ComputationNetworkLib/ComputationNetworkEvaluation.cpp (21 lines):
- line 182: childrenInThisLoop, childrenInOuterLoop; // TODO: think through what these mean when coming from PAR mode
- line 256: // TODO: we should do this in a constructor.
- line 273: // TODO: Once we do nested loops, then the FrameRange argument to this will refer to the outer loop.
- line 321: childrenInThisLoop, childrenInOuterLoop; // TODO: think through what these mean when coming from PAR mode
- line 374: // TODO: should we deallocate in opposite order?
- line 397: // TODO: Check for IsPartOfLoop(). Also why not store the loop id in the node for direct lookup?
- line 400: if (std::find(iter->m_nestedNodes.begin(), iter->m_nestedNodes.end(), node) != iter->m_nestedNodes.end()) // TODO: should this loop need to be a method of SEQTraversalFlowControlNode?
- line 407: // TODO: Would it be sufficient to check against our own time stamp, so that we can use a unified time-stamping mechanism? Then we'd not need this special check for delayed nodes; just check all inputs against our own time stamp.
- line 415: // TODO: when ShiftNode lands, check this as well. Ideally just test whether ptr is a IRecurrentNode
- line 423: // TODO: do this on PARTraversalFlowControlNode
- line 497: // TODO: This is in a somewhat partial state in that we now have a global eval order (keyed by a nullptr), but don't use it yet.
- line 521: // TODO: Do not cache this before reordering; get list & pass to FormRecurrentLoops() which reorders it, then store it (such that GetEvalOrder(nullptr) is always valid w.r.t. loops).
- line 526: // TODO: Move this further down; or decide whether the 'nullptr' version is needed, other than ResetMBLayouts() which could use the global order and filter by itself.
- line 531: // TODO: Don't use m_inputValues, traverse ourselves, to remove dependency on FormEvalOrder().
- line 593: // TODO: This is not ideal. We will also need on-demand compilation, to allow any node to be used as an output after the fact.
- line 641: // TODO: use if (!Is(node))...
- line 650: // TODO Remove m_pMBLayoutOfNetwork altogether. See issue 358.
- line 714: // TODO: In the future we should validate not on the flat list but the PARTraversalFlowControlNode structure. Then this will be unnecessary.
- line 772: for (auto& child : children) // TODO: do we need a check that this is stable if isFinalValidationPass?
- line 1062: // TODO: find a simple topological order and allocateEvalMatrices on that order directly
- line 1253: // TODO: next step: use PARTraversalFlowControlNode::AllocateGradientMatricesForInputs() and ReleaseMatricesAfterBackprop()...

Source/ComputationNetworkLib/ReshapingNodes.h (20 lines):
- line 112: // TODO: We should allow to reduce to a 0-length tensor if the dimension is 0
- line 217: // TODO:
- line 411: // TODO: how to deal with boundary flags?
- line 434: // TODO: Once we do in-place, the above must include a copy-to-self check (either here or inside the tensor lib).
- line 476: // TODO: Once we do in-place, the above must include a copy-to-self check (pay special attention to adding vs. copying).
- line 1169: // TODO: This is very close to the planned SpliceNode (just make m_spliceDim actually configurable) except for splicing along time.
- line 1179: RowStackNode(DEVICEID_TYPE deviceId, const wstring& name, int spliceDim = 1/*TODO: complete this*/)
- line 1371: // TODO: Or should we add an additional dimension?
- line 1596: InputRef(0).ValueAsMatrix().AssignDiagonalValuesTo(ValueAsMatrix()); // TODO: use tensor lib; this is a stride operation
- line 1612: // TODO: use tensor lib, then this will be easy, no memsharing needed
- line 1751: // TODO: This ReshapeNode should no longer be used. Its function will be taken over by Transpose and the Reshape that follows this one below.
- line 1773: // TODO: This definition is poor; we should use a different node name, and specify the factor directly.
- line 1784: // TODO: Changing the TensorShape does not seem to belong here.
- line 1841: // TODO: Clarify/resolve the semantic overlap between BeginForwardProp() and UpdateFunctionMBSize().
- line 1868: // Call this at the end because this will resize Value(), but that requires the updated MBLayout. TODO: Clarify the sequence of events. Should we update the MBLayout in UpdateFunctionMBSize()?
- line 1891: // TODO: It does not make sense to run LegacyReshapeNode frame-by-frame inside a loop, because it changes the time base.
- line 1964: // TODO: We need to decide what reshaping means in presence of a tensor.
- line 2260: - TODO: use a bool vector for the time dimensions --> Gather()
- line 2284: - TODO:
- line 2332: - elementwise nonlinearities as usual [TODO: complete them]

Source/Common/Include/ScriptableObjects.h (20 lines):
- line 114: // TODO: change this to variadic templates, then we can instantiate everything we need through this
- line 134: // TODO: unify with ComputationNodeBase
- line 152: // TODO: move these out from this header into some more general place (I had to move them here because otherwise CNTKEval failed to compile)
- line 246: // TODO: separate this out from BrainScript to an interface that still does type casts--possible?
- line 291: // TODO: somehow the constructor overload from Thunk function fails to compile, so for now use MakeThunk instead
- line 375: // TODO: there is some duplication of type checking; can we unify that?
- line 401: // TODO: factor these lines into a separate function
- line 406: if (p == nullptr) // TODO: can we make this look the same as TypeExpected in BrainScriptEvaluator.cpp? We'd need the type name
- line 415: if (!p) // TODO: can we make this look the same as TypeExpected in BrainScriptEvaluator.cpp? We'd need the type name
- line 430: // TODO: ^^ it seems by saving the name in the ConfigValuePtr itself, we don't gain anything; maybe remove again in the future
- line 504: // TODO: change all id args to wide strings, then update the code.
- line 559: // TODO: Add() does not yet correctly handle the failfn. It is meant to flag the location of the variable identifier
- line 599: // TODO: can ConfigRecordPtr be IConfigRecordPtr?
- line 711: typedef std::map NamedParams; // TODO: maybe even not use a typedef, just use the type
- line 757: // TODO: define an overload that takes const & for external users (which will then take a copy and pass it on to Apply &&)
- line 822: struct ConfigurableRuntimeType // TODO: rename to ScriptableObjects::Factory or something like that
- line 825: // TODO: is this ^^ actually still used anywhere?
- line 827: // TODO: we should pass the expression name to construct() as well
- line 831: // TODO: should this be a static member of above class?
- line 932: #if 1 // TODO: test whether this works correctly w.r.t. typecasting

Source/CNTKv2LibraryDll/proto/onnx/ONNXToCNTK.cpp (18 lines):
- line 637: // TODO: refactore commom code for float and double
- line 918: // TODO: what about double?
- line 969: // TODO: specific to LSTM. icfo (CNTK) to iofc(ONNX)
- line 1007: // TODO: batch shall be one?
- line 1090: // TODO: what about double?
- line 1163: // TODO: this incompatibility needs further investigation.
- line 1242: // TODO: what about double?
- line 1507: // TODO: Do we need to take care of the sequence axis here (like before)?
- line 2256: // TODO: avoid hardcoded values
- line 2294: // TODO: this does not work if mean/var inputs are not constant/parameters.
- line 3122: // TODO: this only works in this specific case.
- line 3145: // TODO: Check whether we should use node output arg name for the check below.
- line 3190: // TODO: make node map to vector of FunctionPtr
- line 3211: // TODO: this is for sink or source - what type of variable for it?
- line 3410: // TODO: Pad op is not intuitative or it could be a bug. One would think begins before end.
- line 3641: // TODO: avoid hardcoded values
- line 3675: // TODO: support bias in CNTK op.
- line 3687: // TODO: this is experimental code to load Facebook Caffe models.

Source/CNTKv2LibraryDll/CompositeFunction.cpp (16 lines):
- line 66: // TODO: same for BatchNorm
- line 93: // TODO: same for BatchNorm
- line 323: // TODO: all logging functionality should be refactored to live in a logging utility class.
- line 510: // TODO: the following two lines are a workaround for a bug in the Math library
- line 557: // TODO: Input variables currently are required to have the default batch axis
- line 566: // TODO: Support inputs with > 1 dynamic axes
- line 1709: // TODO: We should either invalidate and readapt the network if the backpropRoots change compared to what was specified when the network
- line 1716: // TODO: Support changing the device across different invocations of the forward method on a Function instance
- line 1749: // TODO: We currently only support one backprop root
- line 1996: // TODO: The shape of the specified output Value object must match the actual output shape
- line 2132: // TODO: We need a better way to determine the ElementType for the network
- line 2196: // TODO: Avoid copying the data when possible
- line 2256: // TODO: How to deal with the specified 'computeDevice'
- line 2275: // TODO: Support multiple concurrent backprop states
- line 2286: // TODO: Avoid copying the data when possible
- line 2310: // TODO: How to deal with the specified 'computeDevice'

Source/Math/GPUMatrix.cu (16 lines):
- line 399: // TODO: enable UVA p2p access once this is fixed.
- line 577: // TODO: This should be in the storage object.
- line 664: if (m_numRows * numCols > 0) // TODO: remove if unnecessary
- line 1049: // return; // actually a failure --TODO: This should not be necessary. Why is it?
- line 1346: CreateCurandObject(seed, __FUNCTION__); // TODO call ResetCurandObject() instead?
- line 1426: CreateCurandObject(seed, __FUNCTION__); // TODO call ResetCurandObject() instead?
- line 1436: CreateCurandObject(seed, __FUNCTION__); // TODO call ResetCurandObject() instead?
- line 1467: CUDA_CALL(cudaEventCreate(&done)); // TODO: why not condition on do_sync, so that we can use SyncGuard?
- line 2220: UNUSED(c); // TODO: this function seems like a stub
- line 3450: // TODO expAvgFactor == 0 && blendFactor == 1 can be optimized (no need for update).
- line 4749: // TODO: change all three '512' to 'GridDim::maxThreadsPerBlock' (not doing this now since I cannot test it)
- line 4796: // TODO: change all three '512' to 'GridDim::maxThreadsPerBlock' (not doing this now since I cannot test it)
- line 4821: // TODO: Use this to implement ComputationNode::ConstOnes? Or do we even need that anymore?
- line 4869: // TODO: We should observe if these actually make a speed difference, and if not, remove these special cases.
- line 4901: // TODO: Add a special case for tensor bias reduction. cudnn is ~7% faster on Image/QuickE2E.
- line 5088: // TODO: This is duplicated in BestGpu.cpp

Source/Common/Include/Sequences.h (15 lines):
- line 85: // Towards nested loops: --TODO: implement this
- line 96: UniqueSequenceId seqId; // unique sequence id (or GAP_SEQUENCE_ID--TODO: don't include gaps here)
- line 237: // TODO: Should we use a proper priority_queue?
- line 486: // TODO: What are our sorted-ness guarantees?
- line 636: // TODO: We actually just need a boolean matrix for this.
- line 657: // special accessor for sequence training --TODO: must be replaced by a different mechanism
- line 690: // TODO: This will in the future be able to hold sub-ranges for nested loops as well.
- line 695: public: // TODO: make private (currently used from masking and DataFor) ; TODO: rename all members with m_ prefix
- line 699: size_t seqIndex; // parallel-sequence index; SIZE_MAX = all sequences in MB (most common case) --TODO: Bad name, 'sequence' and 'parallel sequence' are two different things
- line 702: const FrameRange *parent; // or NULL: parent range, relative to which this FrameRange is interpreted --TODO: not used yet
- line 940: // TODO: Remove this version (with sanity checks) after this has been tested. Then the function can be inlined above.
- line 946: // TODO: Can probably be faster by using the sequence array directly.
- line 947: // TODO: Or should we just blast m_distanceToStart to GPU, and maks based on that? It is small compared to features.
- line 958: // TODO: This can be done more efficiently by using m_sequences[].
- line 1207: size_t sequenceDim = shape.size() - 2; // (only valid if pMBLayout) --TODO: In case of multiple time dims, this must be adjusted.

Source/ComputationNetworkLib/RecurrentNodes.cpp (15 lines):
- line 128: // TODO: Switch to a vector instead of an unordered_map
- line 182: // TODO: in forward, we don't actually care if we propagate into a gap; could avoid a few unnecessary conditional copies towards the end
- line 222: // TODO: move this to the MBLayout where this can be done together with the creation of the other mask and is likely to further improve performance.
- line 287: // TODO: Are we sharing memory correctly? (no big deal as these are small; yet would be nice)
- line 316: // TODO: this should be a bulk operation; this implementation is a quick hack
- line 413: // TODO: Can we optimize this and only copy if there is a sequence spanning across the end of the MB? And add a check to BeginForwardProp() to make sure we got one if there is a boundary at the start?
- line 440: // This will drag along the gaps as well, hence we mask them to zero above. --TODO : this is not optimal.
- line 459: // TODO: this should be a bulk operation; this implementation is a quick hack
- line 472: // TODO: change below accesses to TensorView, then this is no longer needed. This is now the case, but need to test it.
- line 473: // TODO: we seem to already use TensorView, so this thing may no longer be needed. Too scary to remove.
- line 678: // TODO (this is still unfinished):
- line 715: // (TODO: We could force-evaluate the boundary input here.)
- line 747: // TODO: If we have a truncated-BPTT state then verify that the sequence indices match with m_state->m_sequences, and the tensor dimensions.
- line 1042: if (fr.GetIterationDimension() != m_shiftDimParam) // TODO: this was removed; GetIterationDimension() is always -1 now
- line 1247: virtual NodeStatePtr ExportState() // TODO: can we instead pass the shared_ptr object in? So we don't need to create a new one all the time? Or should we still take ownership of the ptr?

Source/ComputationNetworkLib/LinearAlgebraNodes.h (15 lines):
- line 375: // TODO: Implement this with TensorView::DoElementwiseProductOf() and stride magic
- line 376: // TODO: Transpose flags for all matrices, inputs and outputs?
- line 377: // TODO: allow outputRank < 0 meaning to denote "all but", from right
- line 675: // TODO: better move this special-casing into TensorView::AssignElementwiseProductOf()
- line 747: // TODO: better move this special-casing into TensorView::AssignElementwiseProductOf()
- line 1183: // TODO support multiplication on GPUs as well.
- line 1275: // TODO: change to TensorView and AssignCopyOf() with reduction
- line 1483: // TODO: Would it be useful to allow one of the two to be a single column?
- line 1484: // TODO: Allow to reduce only over a single dimension, or a subset.
- line 1548: // TODO: We could do something more interesting with tensors.
- line 1604: // TODO: This is a special kind of tensor product, and calls for a tensor representation.
- line 1665: // TODO: ^^ Is that correct? Should we use a tensor here, TensorShape(rows0, rows1)?
- line 1715: /*TODO: merge with call site*/ void BackpropToS(const size_t inputIndex, const Matrix& invNorm0, const Matrix& invNorm1, const Matrix& functionValues,
- line 1821: /*TODO: merge with call site*/ void ForwardPropS(Matrix& invNorm0, Matrix& invNorm1, Matrix& functionValues, Matrix& in0, Matrix& in1, Matrix& in2, Matrix& in3, Matrix& leftTermTemp, Matrix& rightTermTemp)
- line 1865: // TODO: This calls for a tensor representation!

Source/Readers/LUSequenceReader/LUSequenceReader.cpp (14 lines):
- line 72: strtmp = trim(strtmp); // TODO: operates in-place, so no need to re-assign to itself
- line 121: lblInfo.m_classInfoLocal->SetValue(0); // TODO: needed? (left-over of refactoring)
- line 124: // TODO: Can it ever be not on the CPU? We allocate it ourselves abovew
- line 161: return true; // TODO: what's this return value for?
- line 204: labelId = found->second; // TODO: This function is called Check...() but it does Get something. Bad name?
- line 440: else if (EqualCI(randomizeString, "auto") || EqualCI(randomizeString, "true")) // TODO: "true" is inconsistent here, should be deprecated
- line 445: // TODO: fail on invalid
- line 826: // TODO: Why is this allowed? Why not terminate?
- line 842: if (actualmbsize > m_mbSize * mToProcess.size()) // TODO: is this a LogicError?
- line 854: // loop through all the samples and create a one-hot representation, or multi-hot in some conditions (TODO: which condition)
- line 885: assert(idx == (LabelIdType) NULLLABEL); // TODO: what other conditions?
- line 1030: bool BatchLUSequenceReader::CanReadFor(wstring nodeName) // TODO: const wstring &
- line 1182: for (auto iter = matrices.begin(); iter != matrices.end(); iter++) // TODO: range-based for
- line 1185: for (typename map*>::iterator p = mReader.begin(); p != mReader.end(); p++) // TODO: range-based for

Source/Readers/LMSequenceReader/SequenceParser.h (14 lines):
- line 551: if (mFile) // TODO: Can this function be called multiple times? Then say so at the top function
- line 554: // TODO: use our File class, so that we get the benefit of popen()
- line 555: if (_wfopen_s(&mFile, fileName, L"rt") != 0) // TODO: What does this warning do? Why not fail?
- line 561: if (mFile) // TODO: can this ever be called without an open file?
- line 565: // Parse - Parse the data --TODO: is this doing the whole file or incrementally?
- line 567: // labels - pointer to vector to return the labels --TODO: change to reference
- line 568: // numbers - pointer to vector to return the numbers --TODO: what the hell are those numbers?
- line 584: char ch2[MAXSTRING]; // TODO: This is 0.5 MB right here on the stack. Really?
- line 586: if (mFile == nullptr) // TODO: why check here and not when it is being opened?
- line 594: if (vstr.size() < 3) // TODO: Document this special condition. Why should we not process empty sequences like ?
- line 599: labels->push_back(std::move(vstr[i])); // TODO: is this an entire sequence, or multiple columns describing a single token?
- line 614: return (long) lineCount; // TODO: change to size_t
- line 621: size_t sLen; // TODO: say what these are
- line 646: // TODO: can return value be negative? If not, use size_t

Source/Math/CPUSparseMatrix.cpp (14 lines):
- line 50: // TODO: Move to CommonMatrix.h
- line 361: RequireSizeAndAllocate(v.GetNumRows(), v.GetNumCols(), v.NzCount() ); // TODO: rename to *Bytes/*Count instead of vague *Size if possible
- line 379: memcpy(GetBlockIds(), v.GetBlockIds(), v.GetBlockSize() * sizeof(size_t)); // TODO: change block id from size_t to CPUSPARSE_INDEX_TYPE, and rename BlockSize to BlockCount
- line 461: // TODO: Does it make sense to parallelize this?
- line 478: // TODO: Does it make sense to parallelize this?
- line 535: // TODO: Replace with std::exclusive_scan when we switch to C++17
- line 540: // TODO: Does it make sense to parallelize this?
- line 784: // TODO: add keepExistingValues (default to true) argument so that the existing values are kept even after reallocation
- line 822: // TODO: This is super ugly. The internals of the storage object should be a shared_ptr.
- line 979: // TODO: Implement CSR as a transposition of b, like we do for GPU.
- line 1376: // TODO: NormalGrad is a misnomer here. Come up with a better name.
- line 1506: // TODO: Unroll 4-times for better performance leveraging vectorization
- line 1702: ElemType v = 0; // TODO: do this in 'double'?
- line 1747: ElemType sum = 0; // TODO: Do this in 'double'?

Source/SequenceTrainingLib/latticeforwardbackward.cpp (13 lines):
- line 501: { // ^^ TODO: remove this
- line 514: // TODO: these are return values as well, but really shouldn't anymore; only used in some older baseline code we some day may want to compare against
- line 710: // TODO: we will later have code that adds this path if needed
- line 828: // TODO: this is not efficient--we only use a block-diagonal-like structure, rest is empty (exploiting the fixed boundaries)
- line 1090: const auto &hmm = hset.gethmm(unit.unit); // TODO: inline these expressions
- line 1125: // TODO: count VIRGINLOGZERO, print per frame
- line 1139: // ... TODO: we don't need this to be a class member, actually; try to just make it a 'static' function.
- line 1143: if (transcript[0].firstframe != 0) // TODO: should we store the #frames instead? Then we can validate the total duration
- line 1233: // TODO:
- line 1329: // TODO: fix this comment
- line 1350: // TODO: the following checks should throw, but I don't dare in case this will crash a critical job... if we never see this warning, then
- line 1365: // TODO: no longer used, remove this. 'transcript' parameter is no longer used in this function.
- line 1388: double totalfwscore = 0; // TODO: name no longer precise in sMBRmode

Source/Common/Include/Basics.h (13 lines):
- line 28: #define TWO_PI 6.283185307f // TODO: find the official standards-confirming definition of this and use it instead
- line 82: #ifndef _MSC_VER // TODO: what is the correct trigger for gcc?
- line 175: { // TODO: rename this
- line 284: // TODO: This does not seem to work well, most places use wtocharpath() instead. Maybe we can remove this.
- line 368: // TODO: merge this with todouble(const char*) above
- line 427: // TODO: switch all uses if isspace() etc. to this once tested well
- line 513: // TODO: Copy Constructor
- line 516: // TODO: Move Constructor
- line 519: // TODO: Assignment operator
- line 522: // TODO: Move assignment operator
- line 554: // TODO: maybe change to type id of an actual thing we pass in
- line 555: // TODO: is this header appropriate?
- line 563: // dynamic loading of modules --TODO: not Basics, should move to its own header

Source/Math/Matrix.h (13 lines):
- line 5: // TODO:
- line 64: // TODO: Move more generic functions such as getting dims, resizing, and getting/setting as scalars in here.
- line 91: mutable int m_devicesTransferedTo[2]; // TODO: what is this for? Seems only diagnostics
- line 114: // TODO: Rewrite this constructor to eliminate the external buffers flag. Make a separate construction mechanism for Matrix objects that don't own their storage.
- line 135: static void SetDevice(DEVICEID_TYPE deviceId); // TODO: unify with PrepareDevice()
- line 229: void Resize(const Matrix& other) // TODO: Should this carry over numNZElemToReserve for sparse matrices?
- line 243: // TODO: Call this ShallowClone instead?
- line 257: // TODO: a future version may want to enforce retaining the content, to allow dynamically growing layouts column by column (when size is not known upfront)
- line 363: // TODO: There are several functions below that perform an in-place operation
- line 453: // TODO: rename these to InPlaceFloor() and -Ceil() (I never know what it means to truncate a bottom)
- line 490: Matrix& AssignVectorNorm1Of(Matrix& a, const bool isColWise); // TODO: arg should be const
- line 493: Matrix& AssignVectorNorm2Of(Matrix& a, const bool isColWise); // TODO: arg should be const
- line 599: // TODO: why are these not static? And why are they here?

Source/Common/fileutil.cpp (13 lines):
- line 11: #pragma warning(disable : 4996) // ^^ this does not seem to work--TODO: make it work
- line 495: #else // TODO: test this
- line 599: // TODO: if we only skip a limited number of bytes, fread() them
- line 867: // TODO: we should redefine this to write UTF-16 (which matters on GCC which defines wchar_t as 32 bit)
- line 994: // TODO: we should redefine this to write UTF-16 (which matters on GCC which defines wchar_t as 32 bit)
- line 1015: // TODO: we should redefine this to write UTF-16 (which matters on GCC which defines wchar_t as 32 bit)
- line 1045: // TODO: we should redefine this to write UTF-16 (which matters on GCC which defines wchar_t as 32 bit)
- line 1085: // TODO: we should redefine this to write UTF-16 (which matters on GCC which defines wchar_t as 32 bit)
- line 1145: // ...TODO: eat trailing space like fscanf() doessurrounding space)
- line 1162: // ... TODO: while (IsWhiteSpace (c)) c = fgetc (f); // skip trailing space
- line 1183: // TODO: we should redefine this to write UTF-16 (which matters on GCC which defines wchar_t as 32 bit)
- line 1198: // ... TODO: while (IsWhiteSpace (c)) c = fgetc (f); // skip trailing space
- line 1827: #else // TODO: test this; e.g. does st_mtime have the desired resolution?

Source/ComputationNetworkLib/SpecialPurposeNodes.h (12 lines):
- line 114: Matrix slicePrior = DataFor(*m_prior, fr); // TODO: use the right MBLayout, then we won't need the special case
- line 287: /*TODO: merge with call site*/ void ForwardPropS(Matrix& functionValues, const Matrix& unnormedPrior, const Matrix& mean, Matrix& logstddev,
- line 546: // TODO: Would this lend itself to a unique_ptr instead of the init flag?
- line 635: // TODO: method names should be CamelCase
- line 669: double m_fsSmoothingWeight; // frame-sequence criterion interpolation weight --TODO: can this be done outside?
- line 684: unsigned long long m_gammatime; // TODO: what are these? Not even the context can be guessed from these names.
- line 963: // TODO: Rename to CustomCriterionNode?
- line 1262: // TODO: This could be more easily implemented as a unary operation, like PassNode.
- line 1281: // TODO:@Amit Due to current limitation of the network builder, we can't bypass the memory copy operation at this step.
- line 1394: // TODO: We currently only support external nodes that cannot be part of CNTK recurrent loops
- line 1411: // TODO: We should avoid this copy but that requires carefully managing the
- line 1421: // TODO: We should avoid this copy but that requires carefully managing the

Source/ActionsLib/SimpleNetworkBuilder.h (12 lines):
- line 14: // TODO: giving up moving stuff for now, running out of time. The following #includes should not be necessary once the hard-working code in here gets moved to .cpp
- line 23: using namespace std; // TODO: ugh!
- line 51: enum class TrainingCriterion : int // TODO: camel-case these
- line 144: m_defaultHiddenActivity = config("defaultHiddenActivity", "0.1"); // TODO: spelling, should be -Activation
- line 145: ConfigArray str_rnnType = config("rnnType", L"SIMPLENET"); // TODO: camelCase
- line 148: m_lookupTableOrder = config("lookupTableOrder", "0"); // TODO: What is this?
- line 172: // TODO: use EqualCI(), and use camelCase, e.g. classLSTM
- line 174: if (std::find(strType.begin(), strType.end(), L"SIMPLENET") != strType.end()) // TODO: camelCase
- line 176: else if (std::find(strType.begin(), strType.end(), L"SIMPLERNN") != strType.end()) // TODO: camelCase
- line 180: else if (std::find(strType.begin(), strType.end(), L"CLASSLM") != strType.end()) // TODO: camelCase
- line 186: else if (std::find(strType.begin(), strType.end(), L"CLASSLSTM") != strType.end()) // TODO: camelCase
- line 206: ConfigArray layerTypes = config("layerTypes", L"Sigmoid"); // TODO: camelCase

Source/Readers/Kaldi2Reader/ssematrix.h (12 lines):
- line 291: // TODO: We should use memset(), but that only works if there are no extra rows (in a patch). Do we even allow non-stripe patches? I don't remember... CUDA lib does.
- line 297: } // TODO: later use memset()
- line 628: const size_t j14 = j1 & ~3; // ... TODO: put this back--when stuff works again
- line 661: // TODO: do that inside the loop to avoid copying, but one thing at a time
- line 691: // ... TODO: put a resize() here and all matmul, so we don't need to set size upfront
- line 707: // This is a weird interface, as it makes also sense for a matrix. TODO: Fix this.
- line 718: // This is a weird interface, as it makes also sense for a matrix. TODO: Fix this.
- line 963: us(i,0) += other(i,t) * weight; // TODO: SSE version (very easy)
- line 1354: // ... TODO: should this be moved into the base class? no need for separate type, just have a stripe() function just like col()
- line 1412: #if 0 // TODO: move to separate header file numahelpers.h
- line 1720: // (helper for qsort() in printmatvaluedistributionf() below --TODO: use a lambda?)
- line 1922: // TODO: This type conflicts with std::vector --we should rename it

Source/SGDLib/DataReaderHelpers.h (12 lines):
- line 22: // TODO: This is a stopgap. SGD will at some point change from sets of matrices to sets of nodes. Then this will become much simpler.
- line 40: // TODO: callers of this often do ComputationNetwork::BumpEvalTimeStamp(featureNodes) and also for labels; we should eliminate the need for this.
- line 66: // TODO: This should not need to be called in case of wasDataRead == false, since in that case, returned values are invalid.
- line 78: // TODO: move this into shim for the old readers.
- line 89: // TODO: This must be a runtime check, not an assert().
- line 107: // TODO: This must be a runtime check, not an assert().
- line 155: en = en > numParallelSequences ? numParallelSequences : en; // TODO: why are these two tests necessary? We should rather test rank
- line 248: // TODO: The following calculation relies on the ill-devised definition of "minibatch" of the current truncated BPTT implementation. Adapt this once fixed.
- line 263: // TODO: Can this just exist inside SGD.cpp?
- line 334: continue; // already in the list --TODO: use insert()
- line 466: shared_ptr> pLearnableNode = node; // TODO: what's this for?
- line 567: // TODO: encapsulate it into a destructor? Note: Cannot throw exceptions in destructor.

Source/Math/GPUSparseMatrix.cu (11 lines):
- line 149: // TODO: to copy other variables used only for class based LM
- line 408: // TODO: enable UVA p2p access once this is fixed.
- line 580: // TODO: We could do this on the GPU, but for now C++ is easier.
- line 698: // TODO: add keepExistingValues (default to true) argument so that the existing values are kept even after reallocation
- line 940: // TODO: All RequireSizeAndAllocate should be async and use a transferer.
- line 1480: // TODO: NormalGrad is a misnomer here. Come up with a better name.
- line 2354: // TODO: This is an unusual use of this operator. Remove this.
- line 2363: // TODO: This is an unusual use of this operator. Remove this.
- line 2546: // TODO: Implement optimized diagonal functions for sparse matrices. For now copy to dense first.
- line 2798: // TODO: Check whether these functions always map 0 to 0.
- line 2897: //TODO: because sparse setting value on non-zero sparse matrix involves

Source/Common/Include/latticearchive.h (11 lines):
- line 203: std::vector edges2; // TODO: rename these
- line 217: public: // TODO: make private again once
- line 341: // TODO: be more consistent--we should clear out edges[] at this point!
- line 528: unsigned int firstframe : 16; // TODO: obsolete; once removed, we are back at 32 bits--yay
- line 576: alignoffsets[L.edges.size()] = (unsigned int) alignbufsize; // (TODO: remove if not actually needed)
- line 619: std::vector backptroffsets; // TODO: we could change this to 'unsigned int' to save some transfer time
- line 672: backptroffsets[L.edges.size()] = backptrbufsize; // (TODO: remove if not actually needed)
- line 1123: // reconstruct old lattice format from this --TODO: remove once we change to new data representation
- line 1261: if (idmap.empty()) // TODO: delete this: && !modelsymmap.empty()/*no mapping; used in conversion*/)
- line 1347: // TODO: should we allow paths relative to TOC file?
- line 1375: #if 0 // TODO: change design to keep the #frames in the TOC, so we can check for mismatches before entering the training iteration

bindings/python/cntk/ops/functions.py (11 lines):
- line 135: # TODO: bring this back once we have a design for name-accessible .outputs etc.
- line 138: # for kw in kwargs: # TODO: only allow one arg
- line 203: # resolve tuples and NamedOutputs --TODO: check for duplicates
- line 207: # ^^ TODO: Complete the design for name-accessible .outputs, then bring this back.
- line 210: # TODO: ^^ is this still necessary? Or is this a sanitize() call we need here? - line 331: # TODO: Should clone-replacing inputs with placeholders reset the shapes to unknown? - line 335: # TODO: add tests for this complex condition - line 439: # TODO: remove the parallel application; instead - line 1260: # TODO have a better name for combine() in this case - line 1303: # TODO: If this is of general interest, consider to move it to progress_print.py - line 1811: # FIXME: seq_starts Source/Math/CPUMatrixImpl.h (11 lines): - line 224: // TODO: why not say *this = ColumnSlice()? - line 657: #pragma omp parallel for // TODO: Depending in circumstance, it may be more efficient to parallelize over rows. - line 1288: // TODO: Unroll 4-times for better performance leveraging vectorization - line 1336: // TODO: Unroll 4-times for better performance leveraging vectorization - line 1476: // TODO: Unroll 4-times for better performance leveraging vectorization - line 1568: // TODO: change to use STL vector instead - line 2036: // TODO: This clips the divisor by a small value. Is that really what one would want? - line 3615: #ifdef __INTEL_COMPILER // TODO: check this - line 3667: #ifdef __INTEL_COMPILER // TODO: check this - line 3691: #ifdef __INTEL_COMPILER // TODO: check this - line 5216: // TODO: support transpose product Source/Readers/HTKDeserializers/HTKMLFReader.cpp (11 lines): - line 41: // TODO: should we make this explicit configuration parameter - line 66: // TODO: deserializers and transformers will be dynamically loaded - line 114: // TODO: this should be bool. Change when config per deserializer is allowed. - line 140: // TODO: Currently BPTT does not support sparse format as output. - line 149: // TODO: should we unify sample and sequence mode packers into a single one. - line 150: // TODO: functionally they are the same, the only difference is how we handle - line 151: // TODO: MBlayout and what is the perf hit for iterating/copying sequences. - line 152: // TODO: Should do more perf tests before unifying these two. - line 154: // TODO: As the next step the packers will be moved out of the readers into the - line 155: // TODO: core CNTK. They are format agnostic and can be used with any type of - line 156: // TODO: deserializers. Source/Common/Include/ssematrix.h (11 lines): - line 297: // TODO: We should use memset(), but that only works if there are no extra rows (in a patch). Do we even allow non-stripe patches? I don't remember... CUDA lib does. - line 303: } // TODO: later use memset() - line 636: const size_t j14 = j1 & ~3; // ... TODO: put this back--when stuff works again - line 669: // TODO: do that inside the loop to avoid copying, but one thing at a time - line 699: // ... TODO: put a resize() here and all matmul, so we don't need to set size upfront - line 715: // This is a weird interface, as it makes also sense for a matrix. TODO: Fix this. - line 726: // This is a weird interface, as it makes also sense for a matrix. TODO: Fix this. - line 1299: // ... TODO: should this be moved into the base class? no need for separate type, just have a stripe() function just like col() - line 1570: // TODO: should this be a function template? - line 1651: // (helper for qsort() in printmatvaluedistributionf() below --TODO: use a lambda?) 
- line 1853: // TODO: This type conflicts with std::vector --we should rename it bindings/python/cntk/ops/__init__.py (10 lines): - line 614: # TODO: running_count should be right after running_inv_std; no need for upwards compat - line 2377: # TODO: does this belong into .sequence? - line 2420: # FIXME figure out how to only SKIP the doctest in CPU - line 2434: # TODO: enable when it is exposed in c++ - line 2745: # TODO: enable when it is exposed in c++ - line 3047: # TODO: rename V2 API function as well from reduce_log_sum() to reduce_log_sum_exp() - line 3620: TODO: Investigate to remove it. - line 3680: # TODO dynamic axis for numpy arrays - line 3681: # TODO sparse for numpy arrays - line 3808: # TODO: ComputeInputPerDimMeansAndInvStdDevs Source/ActionsLib/SimpleNetworkBuilder.cpp (10 lines): - line 202: // TODO: to figure out sparse matrix size - line 223: // TODO: Why the ^^ namespace? - line 234: // TODO: to figure out sparse matrix size - line 559: // TODO: to figure out sparse matrix size - line 669: // TODO: to figure out sparse matrix size - line 672: // TODO: to figure out sparse matrix size - line 675: // TODO: to figure out sparse matrix size - line 678: // TODO: to figure out sparse matrix size - line 681: // TODO: to figure out sparse matrix size - line 741: // TODO: to figure out sparse matrix size Source/CNTKv2LibraryDll/Value.cpp (10 lines): - line 342: //TODO: avoid data copy. - line 355: //TODO: avoid data copy. - line 388: // TODO: Check if this is a derived type and throw an exception in that case - line 394: // TODO: Check if this is a derived type and throw an exception in that case - line 400: // TODO: Check if this is a derived type and throw an exception in that case - line 406: // TODO: Check if this is a derived type and throw an exception in that case - line 412: // TODO: Check if this is a derived type and throw an exception in that case - line 518: // TODO: leverage sparse if the original NDArrayView is in sparse. - line 524: // TODO: direct process sparse data without copy - line 552: // TODO: if function pointer or lambda could support template, switch to use them. Source/SGDLib/SGD.h (10 lines): - line 24: using namespace std; // ugh! TODO: get rid of this from .h files!!! - line 130: // TODO: This should keep everything that is configured by the config. - line 172: // TODO: This ^^ should go away once SGD gets fixed to take the truncation size as a parameter. - line 199: // bool m_needToNormalizeLRByParallUtterance; // TODO: should go away - line 207: // TODO: do not specify 'Truncated' but 'TruncatedLength', set m_truncated so given, and let m_mbSize control how many #parallel sequences the reader is allowed to pack into an MB. - line 359: // TODO: make this independent of ElemType. Then these repeated dynamic_pointer_casts will go away - line 360: // TODO: why is this a class, and not just a procedure? Then we wouldn't have to include the massive header - line 373: // TODO: The next few do not belong into SGD any more than the network or reader we operate on. Either move network and reader in here, or move these out. - line 507: // TODO: move the two-forward-pass support out of the reader.
- line 566: void SaveCheckPointInfo(const size_t epoch, const size_t totalSamplesSeen, // TODO: combine totalSamplesSeen and prevCriterion into a EpochCriterion type Source/ComputationNetworkLib/InputAndParamNodes.cpp (10 lines): - line 22: // TODO: add -Node to the class name - line 49: // - initValue=array or nested array --> initialize from this value, infer dimensions --TODO: not implemented yet - line 60: // The forms that infer the dimensions have different BrainScript names. TODO: need one for fromFile - line 61: // TODO: All forms that require specified dimensions but contain zeroes (to be updated by graph) - line 105: // TODO: add more randomization types, and use a more meaningful scaling - line 447: // Currently, this would cause a matrix/tensor dimension mismatch. --TODO: Is this comment up-to-date? - line 449: // TODO: Get rid of that const_cast, as soon as after Ryan's Matrix-lib refactoring separated out SetValue() from external vs. from deep copy - line 453: // TODO: Move this error check there, since this is called only from one place. - line 569: // TODO: Actually this should never be needed, because each time dimensions change, we init. - line 684: sprintf(str, "learningRateMultiplier=%f NeedsGradient=%s", m_learningRateMultiplier, m_learningRateMultiplier>0 ? "true" : "false"); // TODO: update NDL to accept a better matching name as well Source/Readers/HTKMLFReader/HTKMLFReader.cpp (10 lines): - line 81: m_pMBLayout->Init(m_numSeqsPerMB, 0); // (SGD will ask before entering actual reading --TODO: This is hacky.) - line 327: else randomize = readerConfig(L"randomize"); // TODO: could this not just be randomizeString? - line 383: // TODO: when gcc -v is 4.9 or greater, this should be: std::regex_replace(rootpath, L"\\/+$", wstring()); - line 474: // TODO: when gcc -v is 4.9 or greater, this should be: regex_replace((wstring)ppath, wregex(L"\\.[^\\.\\\\/:]*$"), wstring()); - line 609: // TODO: lots of code dup with the other Prepare function - line 737: m_pMBLayout->Init(m_numSeqsPerMB, 0); // (SGD will ask before entering actual reading --TODO: This is hacky.) - line 777: m_pMBLayout->Init(requestedMBSize, 0); // (SGD will ask before entering actual reading --TODO: This is hacky.) - line 937: // TODO: Why do we have two read functions? Is one not a superset of the other? - line 1476: // TODO: This should use DataFor(). But for that, DataFor() will have to move out from ComputationNode. Ah, it has! - line 1621: Matrix& data = matrices.GetInputMatrix(iter2->first); // can be features or labels (TODO: Really? Didn't we just ^^^ check that it is 'real'?) Source/Readers/HTKMLFReader/utterancesourcemulti.h (10 lines): - line 722: // TODO: Use std::lower_bound - line 804: // TODO: this may go away if we store classids directly in the utterance data - line 823: // TODO: the following is not templated--do it if needed; also should return a const reference then - line 1056: // TODO: also check the #frames here; requires a design change of the TOC format & a rerun - line 1069: // TODO: we can store labels more efficiently now since we don't do frame-wise random access anymore. - line 1208: // TODO: above push_back does not actually 'move' because the internal push_back does not accept that - line 1270: foreach_index (i, allchunks) // TODO: this cries for iterating using the iterator! - line 1351: foreach_index (k, randomizedchunks[0]) // TODO: this really cries for iterating using iterators! 
- line 1785: // TODO: No, return all; and leave it to caller to redistribute them [Zhijie Yan] - line 1792: size_t j = subsetsizes[subsetnum]; // return what we have --TODO: we can remove the above full computation again now Source/ComputationNetworkLib/ComputationNetwork.cpp (10 lines): - line 23: #include "MPIWrapper.h" // TODO: does not belong here - line 123: // TODO: how does the file distinguish float vs double nodes? - line 253: // TODO: Why not just reload it? Because SGD::Train() holds pointers to the parameters directly? That should be fixed. - line 349: // TODO: how does the file distinguish float from double? - line 509: // TODO: just use return! - line 532: for (const auto& node : GetEvalOrder(rootNode)) // TODO: verify that order does not matter here, then replace by GetAllNodesForRoot() - line 648: // TODO: Change this to use an interface that is independent of . - line 979: // TODO: test whether that is true - line 1025: // TODO: Lift this into config language, move underlying code to math lib. This should be a model-editing operation. - line 1176: // TODO: We should be able to move instead of copy but it currently isn't straightforward Source/Common/Include/TensorShape.h (9 lines): - line 391: for (size_t k = 0; k < m_dims.size(); k++) // (TODO: we can save one multiplication here) - line 399: // TODO: move the methods in this region under their respective headline - line 400: // TODO: overload the << and >> operators for serializing TensorShape - line 557: // TODO: rethink whether this is correct for example of negative strides - line 635: // TODO: How to do this right in case of arbitrary strides? Compute the new stride based on m_allocation or something? Is it even possible? Or do we need to guard? - line 790: m_allocation = m_dims.empty() ? 1 : m_dims.back() * (size_t) m_strides.back(); // TODO: Or should an empty shape mean it's a scalar? - line 824: // TODO: double-check all these - line 825: // TODO: Does the same trick work for 2D images? - line 861: // convenience accessors. TODO: use only one name. Rename the members themselves? Source/Common/Include/Config.h (9 lines): - line 16: #pragma warning(disable : 4996) // Caused by the TODO below (line ~1280) - line 120: operator std::string() const { return *this; } // TODO: does not seem to work - line 286: // TODO: do we want to allow accept non-empty strings and non-0 numerical values as 'true'? - line 435: // This is meant for the case where the entire string is a brace expression (TODO: is that true? [fseide]). - line 437: // right after the brace, e.g. [- a - b] will separate using '-' instead of ';'. TODO: document what this is used for. - line 749: // hide new so only stack allocated --TODO: Why do we care? - line 982: // TODO: unify with the Find() function below - line 1003: // TODO: What the hell is this? - line 1376: // TODO: left-over of Linux compat, can be done nicer Source/Common/Include/DataReader.h (9 lines): - line 43: // TODO: Should be unified with StreamDescription from the new reader API - line 185: // TODO: Abstract this. - line 215: typedef unsigned int LabelIdType; // input token mapped to an integer --TODO: why not size_t? Does this save space? - line 276: // TODO: Should be removed when BPTT follows proper minibatch size. - line 316: virtual bool CanReadFor(wstring /* nodeName */) // return true if this reader can output for a node with name nodeName --TODO: const wstring& - line 329: // TODO: move this out of the reader. - line 341: // TODO: move this out of the reader. 
- line 377: vector m_ioNames; // TODO: why are these needed, why not loop over m_dataReaders? - line 474: // TODO: The return value if this is never used except in loops where we do an &=. It is not clear whether that is a bug or intentionally prevents DataEnd() from being called. bindings/python/cntk/layers/sequence.py (9 lines): - line 59: # TODO: reenable this - line 71: # TODO: reconsider the name. Windowed()? - line 193: # TODO: allow to say sequential=False, axis=2, length=100, ... something like this - line 266: # TODO: better say right here what the requirement is! - line 330: # TODO: Can bidirectionality be an option of this? bidirectional=True? - line 452: # TODO: if initial_state is a CNTK Function rather than an initializer, then require to pass it multiple times; otherwise broadcast to all - line 556: # TODO: This API is still suboptimal, and should be fixed as follows: - line 635: # TODO: having to pass the dynamic axis is suboptimal. Any better way? - line 648: # TODO: must allow multiple variables, just like recurrence, as to allow beam decoding (permutation matrix) Source/Readers/Kaldi2Reader/msra_mgram.h (8 lines): - line 28: // ... TODO (?): return true/false to indicate whether anything changed. - line 36: // ... TODO: ensure iterators do not return OOVs w.r.t. user symbol table - line 39: // ... TODO: change this to key() or something like this - line 76: // ... TODO: use the proper constants here (slightly inconsistent) - line 1161: // ^^ TODO: can we do away with this entirely and replace it by map.order()/this->order() - line 1172: // ... TODO: rethink the resize business. It is for shrinking only. - line 2407: // ... TODO: use a constant to define the maximum KN count level, - line 2686: // ... TODO: actually, is subtracting 1 the right thing to do here? Source/CNTK/BrainScript/BrainScriptParser.cpp (8 lines): - line 320: eof // TODO: what are true and false? Literals or identifiers? - line 397: // TODO: also allow ... syntax, where ... refers to the directory of the enclosing file - line 403: // TODO: This is a little weird. Rather, this should be done by the call site. - line 407: // TODO: We should use the separator that matches the include path. - line 427: // TODO: need to know whether we want to see '\n' or not - line 671: // TODO: Would be more direct to fold this into the table below as well. - line 683: {L">>", 5}, {L"<<", 5}, // TODO: do it as other languages - line 827: // TODO: test parsing of i => j => i*j Source/ComputationNetworkLib/InputAndParamNodes.h (8 lines): - line 33: // TODO: add -Node to the class name - line 227: // TODO: add -Node to the class names - line 269: // TODO This currently reads a ComputationNode object from a property, thereby bypassing "normal" input handling. - line 396: // TODO: There is still debate whether an InputValue without layout makes sense. - line 518: // TODO: Noone else overrides this method. So is this the right mechanism? - line 598: /*TODO: merge with call site*/ void BackpropToLeft(Matrix& inputFunctionValues, Matrix& inputGradientValues, Matrix& gradientValues) - line 613: /*TODO: merge with call site*/ void BackpropToRight(Matrix& inputFunctionValues, Matrix& inputGradientValues, Matrix& gradientValues) - line 661: // TODO: Should this add a tensor dimension? 
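An aside on the mechanical entries above: the --TODO: const wstring& note on DataReader.h line 316 only asks that the node name be passed by const reference instead of by value. A minimal C++ sketch of that change, against a hypothetical interface rather than CNTK's actual one:

    #include <string>

    // Hypothetical reader interface, for illustration only (not CNTK's).
    class IReaderSketch
    {
    public:
        // Before: pass-by-value copies the wstring on every call:
        //   virtual bool CanReadFor(std::wstring nodeName);
        // After, as the TODO suggests: a const reference avoids the copy
        // without changing any call site.
        virtual bool CanReadFor(const std::wstring& nodeName) = 0;
        virtual ~IReaderSketch() = default;
    };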
Source/CNTKv2LibraryDll/Function.cpp (8 lines): - line 238: // TODO: Exclude inputs not belonging to 'gradients' from the gradient computation - line 597: // TODO: Make sure that the loaded model is the same as the trainer's model through UID matching in the V2 format - line 598: // TODO: For V1 format models make sure that the loaded model is isomorphic to the trainer's model - line 1068: //TODO (backcompat): when loading a stale model we can still pass this test - line 1321: // TODO: needs numerically stable implementation. - line 1715: // TODO: this code is needed for ONNX converter because ONNX requires squeeze axis. However, unit test failed with this code. - line 2835: // TODO: If the condition is a scalar constant, we can just pass-through the appropriate operand - line 3031: // TODO: In V1 graph generation, ReconcileDynamicAxis() should be treated like a no-op if the axis is known to be the same. Source/CNTKv2LibraryDll/PrimitiveFunction.cpp (7 lines): - line 320: // TODO: We currently only support input operand with 1 dynamic axis for PastValue/FutureValue - line 1019: //TODO for very far future: Handle reduction on (multiple) batches all in once: batchAxesToReduce - line 1020: //TODO for very far future: Handle reduction on (multiple) sequences all in once: sequenceAxesToReduce - line 1046: // inherit tensor dimension from sourceData, minus the last (column or time) dimension. TODO this needs to become simpler... - line 1428: // TODO: all logging functionality should be refactored to live in a logging utility class. - line 1500: // TODO: Should we do this for all of the axes in kernelShape that have a dimensionality of NDShape::InferredDimension? - line 1563: // TODO: Is this logic of transitively constructing the output shape from the operands correct? Source/SequenceTrainingLib/parallelforwardbackward.cpp (7 lines): - line 205: // TODO: This function is about determining the parallelization layout - line 229: Eframescorrectbuf; // TODO: remove this [v-hansu] - line 433: // TODO: this can only be cached once --but there is no check whether a different model is passed - line 711: } // pass models in (to GPU) // TODO: rethink the naming of this function - line 734: // TODO: Overload to enable compilation for DoublePrecision though it's currently unsupported - line 750: // TODO: Overload to enable compilation for DoublePrecision though it's currently unsupported - line 811: { // ^^ TODO: remove this bindings/python/cntk/layers/blocks.py (7 lines): - line 34: # - maps init_default_override_or_glorot_uniform to default --TODO: we should have a global setting for that - line 36: # TODO: remove default resolution, only make this a conversion; then rename - line 46: #return init # TODO: change to this once this works, e.g. for layers.BatchNormalization() - line 62: return Constant(0) # note: don't pass None to past_value, because that would default to float32 --TODO: still the case? - line 116: # TODO: ^^ should no longer be needed; delete once confirmed - line 280: # TODO: should both activations be replaced? - line 329: # h(t) = (1 - i(t) .* h'(t)) + i(t) .* h(t-1) --TODO: need to confirm bracketing with NVIDIA Source/CNTKv2LibraryDll/Learner.cpp (7 lines): - line 321: // TODO: make this a runtime parameter. - line 413: // TODO: should we also save momentum schedule into the checkpoint? - line 424: //TODO: additional options are not serialized. This was not done when AdditionalOption was introduced. - line 463: // TODO: which learning rate schedule should take precedence here?
- line 510: //TODO: additional options are not deserialized. This was not done when AdditionalOption was introduced. - line 557: //TODO: The unit gain term (1-beta) should stay as it is (currentMomentum) instead of using the following scaled term. - line 769: // TODO: consider exposing this somehow so that it is easy to test by setting it to small value. Source/Readers/Kaldi2Reader/utterancesourcemulti.h (7 lines): - line 325: // TODO: this may go away if we store classids directly in the utterance data - line 344: // TODO: the following is not templated--do it if needed; also should return a const reference then - line 539: // TODO: also check the #frames here; requires a design change of the TOC format & a rerun - line 552: // TODO: we can store labels more efficiently now since we don't do frame-wise random access anymore. - line 683: // TODO: above push_back does not actually 'move' because the internal push_back does not accept that - line 790: foreach_index (i, allchunks) // TODO: this cries for iterating using the iterator! - line 864: foreach_index (k, randomizedchunks[0]) // TODO: this really cries for iterating using iterators! Source/ActionsLib/OtherActions.cpp (7 lines): - line 211: // TODO: just use a lambda - line 248: int nbrCls = config(L"nbrClass", "0"); // TODO: why int and not size_t? - line 295: ifstream fp(inputFile.c_str()); // TODO: use class File, as to support pipes - line 364: // Implements an algorithm by Mikolov --TODO: get the reference - line 369: double unkCount = 0; // TODO: why double? - line 378: double freq = q.top().second; // TODO: why double? - line 480: // TODO: use safe-save, i.e. write to temp name and rename at the end Source/Math/GPUMatrixCUDAKernels.cuh (7 lines): - line 51: assert(false); // TODO: implement later - line 56: // TODO: replace this with TensorOps.h LogAdd(). It differs in using ElemType throughout, while this one seems to use 'double' versions of exp() and log(). - line 133: m_threadsPerBlock = N; // don't launch more than necessary --TODO: Does this make a difference at all? - line 174: // TODO: drop call_once and co. and make cached devices a local static, once we're on VS2015. - line 3142: // TODO: This function can be further improved by loading the kernel in shared memory - line 3488: atomicAdd(&resultValues[IDX2C(lhsRow, resultCol, numRowsLhs)], lhsValue * rhsVal); //TODO: this does not work with fp16 for sparse embedding - line 5237: // TODO: This kernel has very poor performance and needs to Source/Readers/HTKMLFReader/msra_mgram.h (6 lines): - line 29: // ... TODO (?): return true/false to indicate whether anything changed. - line 38: // ... TODO: ensure iterators do not return OOVs w.r.t. user symbol table - line 41: // ... TODO: change this to key() or something like this - line 78: // ... TODO: use the proper constants here (slightly inconsistent) - line 1164: // ^^ TODO: can we do away with this entirely and replace it by map.order()/this->order() - line 1175: // ... TODO: rethink the resize business. It is for shrinking only. Source/Readers/UCIFastReader/UCIFastReader.cpp (6 lines): - line 330: // string name = configFeatures.Name(); // TODO: Aaargh!!! - line 409: // TODO: We use the old CNTK config reader for this. With BrainScript, we would have to parse the file locally here, which should be easy. - line 457: // TODO: We could implement an overlay IConfigRecord implementation that fakes the two values that are being added to the interface. - line 459: // TODO: We could copy the IConfigRecordPtr. That is allowed. Not trivial to do with template magic.
- line 477: // TODO: need to go down to all levels, maybe search for sectionType - line 816: // TODO: Why can we not just pass m_prefetchMatrices? Source/ComputationNetworkLib/ComputationNetworkScripting.cpp (6 lines): - line 58: // TODO: This currently only supports nodes of the same ElemType. We could allow conversion operators. - line 75: // TODO: process "outputNodes" etc. arrays: Sync to node Tags, and make them all roots. - line 191: // TODO: Is there more than nodes that we want to return? Node groups? deviceId? - line 208: // TODO: What is the expressionPath? - line 276: const wstring& expressionName = nodeName; // TODO: think this through - line 521: // TODO: Is this really always an error? Are there valid cases where one would over-specify possible input nodes, even if they are not used/needed? Source/ComputationNetworkLib/ComputationNetworkBuilder.cpp (6 lines): - line 144: // TODO: DiagTimes is also an alias of ElementTimes; current separate implementation is unnecessary. - line 274: // TODO: Do we really need these? Folks who want to use C++ can instead say net->AddNodeToNet(New<>(...)), which is not that different. - line 275: // TODO: separate into nodes that have inputs and those that duplicate functions with input adding except just not adding inputs. Clear? - line 280: // TODO: in SimpleNetworkBuilder, this is very often followed by InitLearnableParameter()--we should have an overload that just does it right away - line 292: // TODO: change these to take an actual object instead of a name for dynamicAxis - line 399: // TODO: Do we need both this set and the one above that does not add inputs? Can they share more code? bindings/python/cntk/cntk_py.i (6 lines): - line 235: // FIXME use not yet existing NDShape function that returns the dimensions at once - line 315: /* FIXME We would love to do the following, but the hashing does not - line 840: // FIXME: The following is not picked up yet, which is why we have to tag it to - line 1012: /* FIXME We would love to do the following, but the hashing does not - line 1520: // TODO: do progressWriters below also have a similar issue? - line 1756: // FIXME ignore is ignored Source/CNTK/CNTK.cpp (6 lines): - line 90: // TODO: if there is already a file, rename it - line 112: // TODO: Clarify how a single thread restriction can be lifted. - line 438: // TODO: There is a lot of duplication between this function and the NDL version. - line 594: // TODO: When running in parallel with MPI, only commands in 'commandstoRunOnAllRanks' should - line 609: // TODO: change this back to COMPLETED, double underscores don't look good in output - line 940: // TODO: change to STL containers Source/Math/latticefunctionskernels.h (6 lines): - line 42: #else // TODO: remove this once we got this figured out - line 46: #define force_crash() (*((int *) -1) = 0) // TODO: this does not in fact seem to crash it... - line 69: // TODO: either check when creating this whether this assumption is true, or control this through a flag in here. - line 229: } // TODO: ain't there an overload for this? - line 364: { // TODO: alignresult will change to (start,end) - line 633: // TODO: change this later on Source/Math/CommonMatrix.h (6 lines): - line 41: // TODO: merge these two types - line 222: // TODO: remove all formats that are actually not supported - line 232: // TODO: remove all formats that are actually not supported - line 544: // TODO: Some of these accessors should be merged into single methods like SetBuffer.
- line 638: // TODO: m_sliceViewOffset has a different meaning in sparse (column offset) versus dense (byte offset to start of pointer). This should perhaps be fixed. - line 640: // TODO: implement m_colStride Source/ComputationNetworkLib/ComputationNode.cpp (6 lines): - line 10: #include "ComputationNetworkBuilder.h" // TODO: We should only pull in NewComputationNodeFromConfig(). Nodes should not know about network at large. - line 558: // TODO: Turn rank into a member variable, and call this method once in validation (currently called for every single ForwardProp/BackpropTo()). - line 576: TensorShape tensorShape = GetSampleLayout(); // TODO: Do we need to expect this tensor to have arbitrary strides? In case it came out of a Slice, Reshape, or Transpose op in-place? - line 694: if ((!IsLeaf() || Is>()) && !RequiresPreCompute()) // TODO: guard this through overrides instead - line 1038: // TODO: This can be done more nicely. We should keep the block structure. - line 1164: // TODO: change those strings into wstrings to avoid this conversion mess Source/ActionsLib/NDLNetworkBuilder.cpp (6 lines): - line 94: // TODO: Map dynamicAxis from name to node at this point, where that node is memoized inside NDL. - line 336: // TODO: allow a tensor descriptor. Or allow 0 (inference). Maybe already supported--check this. - line 349: float defaultHiddenActivity = node->GetOptionalParameter("defaultHiddenActivity", "0.1"); // TODO: parameter should be called 'defaultHiddenActivation' - line 625: // TODO: Is there a better way to discriminate? - line 751: #else // TODO: delete this - line 754: // TODO: just use a vector attach Source/CNTKv2LibraryDll/Utils.cpp (6 lines): - line 43: // TODO: replace this copy with an alias to value. - line 425: //TODO: Need to record the original rate and the reference mbsize so that the unit gain factor can be computed correctly. - line 563: // TODO: This is a temporary debugging aid and should be removed after the functionality to late bind - line 570: // TODO: Is supplying dense data for an Input variable tagged as sparse, a fatal error even for packed value objects? - line 590: // TODO: try and remove support for this in the future, change the condition below to - line 928: //TODO: Consider using a vector/unique_ptr here to avoid potential memory leaks Source/Common/DataReader.cpp (5 lines): - line 43: // TODO: This should be a runtime check, not an assert() that only runs in Debug. - line 82: #pragma optimize("", off) // TODO work around potential VS2015 code optimization bug, replacing virtual- by non-virtual call in Init() below - line 138: // TODO: merge with the code above, but we first need to get the nbrUttPerMinibatch initialized inside each reader - line 144: // pass on some global option --TODO: Why is this not done inside each reader?? - line 261: m_dataReaders[m_ioNames[i]]->SetNumParallelSequences(nbr); // the first one determines the param of all others --TODO: This is flimsy. Source/Common/Include/basetypes.h (5 lines): - line 37: #include "Windows.h" // for CRITICAL_SECTION and Unicode conversion functions --TODO: is there a portable alternative? - line 111: // class fixed_vector - non-resizable vector --TODO: just use std::vector - line 132: // ... TODO: when I make this public, LinearTransform.h acts totally up but I cannot see where it comes from. - line 331: // ... TODO: change all of basetypes classes/structs to use this
- line 411: // TODO: This will fail to compile under VS 2008--we need an #ifdef around this Source/Readers/CNTKTextFormatReader/TextParser.cpp (5 lines): - line 316: // TODO: reuse loaded sequences instead of creating new ones! - line 366: // TODO this handling needs to be graceful, but currently CNTK complains when we return empty sequences. - line 961: // TODO: better precision (at the moment we're at parity with UCIFast)? - line 1062: // TODO: ignore if number of precision digits > FLT_[MANT_]DIG/DBL_[MANT_]DIG - line 1135: // TODO: check the exponent value (see FLT_[MAX/MIN]_10_EXP). Source/Common/File.cpp (5 lines): - line 168: // TODO: - line 176: #ifdef CNTK_UWP // UWP-TODO: find a replacement for PathRemoveFileSpec - line 399: // TODO: why not write a BOM? - line 403: // TODO: What?? - line 646: // TODO: This function actually consumes the white-space characters. Document that behavior. Source/Math/CPUMatrixTensorImpl.h (5 lines): - line 196: // TODO: According to Amit, the VS compiler is not able to vectorize into lambdas. Solution: change the lambda to take an N, or to implement the loop inside (with 1 element by default). - line 197: // TODO: The signedness of k (required for omp) causes an extra sign-extend. - line 198: // TODO: OMP adds LOTS of overhead. Do we need a guard, a min size when to use it? - line 367: // TODO: adapt e2e tests to run with aggregator of type ElemType. - line 432: // TODO: Change the lambda to take a pointer and a number of elements, so that we can pass it 1 or 4 elements, in order for it to SSE-vectorize. Source/CNTKv2LibraryDll/TrainingSession.cpp (5 lines): - line 267: // TODO: Possibly expose a limiting counter on the number of samples for validation. - line 276: // TODO: This is meant as a stop gap, the minibatch source should be properly drained instead. - line 298: // TODO: it may be slow to rely on TestMinibatch to return error each time, since it may require transfer - line 388: // TODO: is copy really necessary here? - line 498: // TODO: Should have proper logging instead. Source/ComputationNetworkLib/ComputationNetworkBuilder.h (5 lines): - line 37: // TODO: can these be changed to ComputationNodeBasePtr? - line 38: // TODO: move into a separate header/class, to decouple from this class which would then be only used by old NDL and SimpleNetworkBuilder. - line 46: // TODO: Do we really need these? Folks who want to use C++ can instead say net->AddNodeToNet(New<>(...)), which is not that different. - line 47: // TODO: separate into nodes that have inputs and those that duplicate functions with input adding except just not adding inputs. Clear? - line 94: // TODO: Do we need both this set and the one above that does not add inputs? Can they share more code? Source/Readers/ReaderLib/ReaderShim.cpp (5 lines): - line 27: // TODO: Currently there is implementation of these in the V2 library, but it is not exposed and requires linking dependency. - line 230: // TODO use boost::algorithm::join, boost::adapters::transformed, make this a generic function - line 248: // TODO: verify that the set of matrix names is identical - line 256: //TODO: Set proper format on matrices? - line 445: // TODO: We should return 0 here. Source/Readers/Kaldi2Reader/latticearchive.cpp (4 lines): - line 168: // TODO: This is sort of redundant now--it gets the symmap from the HMM, i.e. always the same for all archives.
- line 428: // TODO: I find that HVite emits redundant physical triphones, and even HHEd seems so (in .tying file). - line 495: // convert it --TODO: once we permanently use the new format, do this in fread() for V1 - line 671: const auto &transcripts = labels.allwordtranscripts(); // (TODO: we could just pass the transcripts map--does not really matter) Source/ComputationNetworkLib/ComputationNetworkEditing.cpp (4 lines): - line 179: if (newNode->NodeName() != nodeName) // TODO: This was not tested for earlier; I hope no code depends on this. - line 190: newNode->SetInput(i, oldNode->GetInputs()[i]); // TODO: use AttachInput()? - line 245: // TODO: Can this be called with a node that's already part of the network? This is currently allowed, but should it? - line 266: DeleteNode(oldNodeName); // TODO: can this just be RemoveNodeFromNet()? Source/Readers/HTKMLFReader/rollingwindowsource.h (4 lines): - line 288: // TODO: - line 388: if (numframes != classids.size()) // TODO: remove this once we are confident - line 573: // TODO: - line 716: if (numframes != classids[j].size()) // TODO: remove this once we are confident Source/Readers/HTKMLFReader/latticearchive.cpp (4 lines): - line 156: // TODO: This is sort of redundant now--it gets the symmap from the HMM, i.e. always the same for all archives. - line 416: // TODO: I find that HVite emits redundant physical triphones, and even HHEd seems so (in .tying file). - line 483: // convert it --TODO: once we permanently use the new format, do this in fread() for V1 - line 659: const auto &transcripts = labels.allwordtranscripts(); // (TODO: we could just pass the transcripts map--does not really matter) Source/Readers/HTKMLFReader/biggrowablevectors.h (4 lines): - line 16: // TODO: - line 41: // TODO: update allocated range - line 50: // TODO: update allocated range --also enforce consecutiveness - line 59: // TODO: update allocated range --also enforce consecutiveness bindings/python/cntk/variables.py (4 lines): - line 200: _unknown_shape = (-2,) # TODO: take this from the catacombs of cntk_py - line 236: if axis.name == 'defaultBatchAxis': # axis == Axis.default_batch_axis(): --TODO: how to do this right? - line 238: if axis.name == 'defaultDynamicAxis' or axis.name == 'UnknownAxes': # TODO: how to do this right? - line 256: # TODO: it would be great if in a future version we could recognize and support Python 3.5 typing.Sequence Source/Readers/HTKDeserializers/HTKDeserializer.cpp (4 lines): - line 30: // TODO: This should be read in one place, potentially given by SGD. - line 173: // TODO: We should be able to configure IO chunks based on size. - line 519: // TODO: Move augmentation to the separate class outside of deserializer. - line 520: // TODO: Check the CNTK Book why different left and right extents are not supported. Source/ComputationNetworkLib/ComputationNetworkAnalysis.cpp (4 lines): - line 34: // - the cached m_evalOrders[root], reordered to make nodes belonging to the same loop consecutive. TODO: Try not to do that. - line 60: flowControlNode.m_nestedNodes = c.Nodes(); // TODO: make these two part of the constructor - line 71: // TODO: Get rid of this after-the-fact patch. - line 102: // TODO: Move this up to where it is used (in a separate commit since git cannot track moving and changing at the same time). Source/CNTKv2LibraryDll/DistributedCommunicator.cpp (4 lines): - line 113: // TODO: device.Type should be called Kind. - line 263: // TODO: Currently we only support concatenation of inputs of the same size. 
- line 395: // TODO: actually, we can start reducing all cpu values first, and then wait for the gpu->cpu transfer to finish. - line 482: // TODO: Should not wait, simply publishing event on the compute stream should be sufficient Source/Readers/CNTKTextFormatReader/CNTKTextFormatReader.cpp (4 lines): - line 22: // TODO: This class should go away eventually. - line 23: // TODO: The composition of packer + randomizer + different deserializers in a generic manner is done in the CompositeDataReader. - line 24: // TODO: Currently preserving this for backward compatibility with current configs. - line 43: // TODO: drop "verbosity", use config.traceLevel() instead. Source/ComputationNetworkLib/NonlinearityNodes.h (4 lines): - line 191: // TODO: This was used more broadly, but no longer, so we may be able to simplify the signatures of the virtual functions. - line 228: // TODO: once this gets reimplemented using TensorView, then this is no longer needed. - line 417: // TODO: make function value sparse? - line 592: // TODO: This is a stopgap. Is this the right thing to do? It changes the matrix type in-place. Source/CNTKv2LibraryDll/Trainer.cpp (4 lines): - line 168: // TODO: exclude updating progress writers from profiling? - line 190: // TODO: exclude updating progress writers from profiling? - line 397: // TODO: Why Backward signature does not take Parameter instead of Variable for gradients? - line 527: // TODO: better return 0; it is then still valid to compute lossAverage * numSamples Source/Readers/Kaldi2Reader/biggrowablevectors.h (4 lines): - line 16: // TODO: - line 41: // TODO: update allocated range - line 50: // TODO: update allocated range --also enforce consecutiveness - line 59: // TODO: update allocated range --also enforce consecutiveness Source/SGDLib/SimpleEvaluator.h (3 lines): - line 15: #include "TrainingNodes.h" // TODO: we should move the functions that depend on these to the .cpp - line 34: // TODO: get rid of dependency on ElemType - line 179: // TODO: We are reusing the aggregation logic inside SimpleDistGradAggregator, which has a heavy dependency Source/Readers/ImageReader/Exports.cpp (3 lines): - line 24: // TODO: Memory provider should be injected by SGD. - line 41: //TODO: Names of transforms and deserializers should be case insensitive. - line 43: // TODO: Not safe from the ABI perspective. Will be uglified to make the interface ABI. Source/Math/MatrixQuantizerGPU.cu (3 lines): - line 23: // TODO: get from an env variable - line 219: // TODO: Check for error code and throw if !std::uncaught_exception() - line 359: // TODO: Check for error code and throw if !std::uncaught_exception() Source/Common/Include/numahelpers.h (3 lines): - line 20: // ... TODO: this can be a 'static', as it should only be set during foreach_node but not outside - line 86: // ... TODO: honor ppl_cores == 1 for comparative measurements against single threads. - line 252: return; // ... TODO: tell parallel_for_on_each_numa_node() to only have one step, or parallelize
Source/Math/ValueQuantizer.h (3 lines): - line 88: size_t usedrangeend = rangeend - (Nbits > 1); // TODO: make this a parameter - line 100: // TODO: we can optimize for 1 bit here - very simply use a template arg 'isonebit' - line 108: // TODO: we may need to optimize this by a template arg Source/Common/Include/simplesenonehmm.h (3 lines): - line 27: public: // (TODO: better encapsulation) - line 111: std::vector transPs; // the transition matrices --TODO: finish this - line 215: // TODO: this becomes a hard lookup with failure Source/Readers/LMSequenceReader/SequenceReader.h (3 lines): - line 288: // TODO: ^^ should this be void CopyMBLayoutTo(MBLayoutPtr pMBLayout); - line 423: size_t GetNumParallelSequencesForFixingBPTTMode() override { return mToProcess.size(); } // TODO: or get it from MBLayout? Can this ever be called before GetMinibatch()? - line 425: // TODO: what are these? Source/Readers/Kaldi2Reader/numahelpers.h (3 lines): - line 22: // ... TODO: this can be a 'static', as it should only be set during foreach_node but not outside - line 84: // ... TODO: honor ppl_cores == 1 for comparative measurements against single threads. - line 237: return; // ... TODO: tell parallel_for_on_each_numa_node() to only have one step, or parallelize Source/Math/TensorView.cpp (3 lines): - line 9: // TODO: - line 118: shapes[i].FlattenInPlace(k); // TODO: overdoing the immutable thingy much? - line 151: for (size_t i = 0; i < N; i++) // TODO: do we need to test output tensor here as well? Source/Common/Include/File.h (3 lines): - line 22: #include // for LoadMatrixFromTextFile() --TODO: change to using this File class - line 259: // This function does not quite fit here, but it fits elsewhere even worse. TODO: change to use File class! - line 270: // TODO: Move this to class File, as this is similar in nature to LoadMatrixFromTextFile(). Source/Readers/ReaderLib/BlockRandomizer.cpp (3 lines): - line 278: // TODO: move 'PrepareNewSweepIfNeeded' inside the sequence randomizer and drop this requirement. - line 287: // TODO: should we just drop this flag and return false if we cannot fulfil this request? - line 386: // TODO diagnostics for paged out chunks? Source/ComputationNetworkLib/ReshapingNodes.cpp (3 lines): - line 403: // TODO: Where should the MBLayout be created--in BeginForwardProp() or ForwardProp()? - line 612: // inherit tensor dimension from sourceData, minus the last (column or time) dimension. TODO this needs to become simpler... - line 685: // TODO: We also know that indexData and sourceData must have the same MBLayout. But that is checked at runtime. Source/Readers/Kaldi2Reader/HTKMLFReader.cpp (3 lines): - line 1419: &matrices.GetInputMatrix(iter->first)); // TODO: use a reference instead of a ptr - line 1437: &matrices.GetInputMatrix(iter->first)); // TODO: use a reference instead of a ptr - line 1912: const auto& outputs = dynamic_cast&>(outputsb); // TODO: a NULL check, to be sure Source/Readers/ReaderLib/SequenceRandomizer.cpp (3 lines): - line 185: // TODO: This can be done more efficiently, we know the range of chunks already. - line 224: // TODO assert only - line 311: // TODO most of the time, we can advance to the right sequence here Source/Math/CntkBatchNormalization.cuh (3 lines): - line 220: double expAvgFactor, // TODO why not ElemType? same for the other parameters, functions?
- line 377: // TODO add back special cases - line 564: // TODO add back special cases Source/Readers/CNTKTextFormatReader/Exports.cpp (3 lines): - line 20: // TODO: Memory provider should be injected by SGD. - line 37: // TODO: Not safe from the ABI perspective. Will be uglified to make the interface ABI. - line 47: // TODO: Remove type from the parser. Current implementation does not support streams of different types. Source/CNTKv2LibraryDll/proto/onnx/Operators.h (3 lines): - line 25: // TODO: support cases where batch size is not 1. - line 40: // TODO: this does not work completely. - line 41: // TODO: Waiting Skype smart reply with attention model before enabling the functionality of tracking sequence dimension. Source/Math/MatrixQuantizerCPU.cpp (3 lines): - line 16: // TODO: Support transferring the quantization output to a quantized matrix on the GPU - line 68: // TODO: Currently this is a no-op since the actual quantization is synchronous - line 105: // TODO: Currently this is a no-op since the actual quantization is synchronous Source/Readers/HTKMLFReader/HTKMLFReader.h (3 lines): - line 48: vector m_switchFrame; // TODO: something like the position where a new sequence starts; still supported? - line 113: void fillOneUttDataforParallelmode(StreamMinibatchInputs& matrices, size_t startFr, size_t framenum, size_t channelIndex, size_t sourceChannelIndex); // TODO: PascalCase() - line 151: // TODO: this ^^ does not seem to belong here. Source/CNTKv2LibraryDll/Learner.h (3 lines): - line 128: // TODO: make these functions friends of NDViewArray and move to Utils? - line 205: //TODO: Preliminary study shows that the unitgain factor should use the raw momentum instead of the scaled momentum as the following: - line 358: //TODO: According to my preliminary analysis, the second momentum variance scaling is different from momentum scaling; need to double check -- yuqing tang Source/Readers/LMSequenceReader/SequenceParser.cpp (3 lines): - line 294: // TODO: In sequence reader we probably don't need to store numbers in labels (we'll see) - line 351: // TODO: Is this ever called with anything other than 0? - line 585: return (long) linecnt; // TODO: change to size_t Source/Readers/ReaderLib/SequenceRandomizer.h (3 lines): - line 30: // TODO: This code is still based on the old behavior, so that all current tests pass. - line 31: // TODO: Can be simplified if we only randomized sequences forward. - line 138: // TODO consider to change to ChunkIdType where appropriate Source/SequenceTrainingLib/gammacalculation.h (3 lines): - line 158: // TODO: Adapt this to new MBLayout, m_sequences would be easier to work off. - line 454: // TODO: This function is duplicate of the one in HTLMLFReader. - line 474: // TODO: This function is duplicate of the one in HTLMLFReader. Source/Readers/HTKDeserializers/MLFDeserializer.h (3 lines): - line 76: // TODO: Should be removed, when all readers go away, expects configuration in a legacy mode. - line 184: // TODO: Possibly set m_valid to false, but currently preserving the old behavior. - line 332: // TODO: Possibly set m_valid to false, but currently preserving the old behavior. Source/CNTKv2LibraryDll/API/CNTKLibraryInternals.h (3 lines): - line 67: // TODO: The following should be reconciled with the equivalent code in the CNTK implementation - line 94: #ifndef _MSC_VER // TODO: what is the correct trigger for gcc? - line 451: // TODO: replace by std::optional, once it's fully supported by VS. 
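The last CNTKLibraryInternals.h entry above (line 451) names its own fix: once the toolchain allows it, a hand-rolled "value plus is-set flag" becomes std::optional. A C++17 sketch of the pattern, with made-up names:

    #include <cctype>
    #include <iostream>
    #include <optional>
    #include <string>

    // Hypothetical example: return "no value" without a separate bool flag.
    std::optional<int> TryParseDeviceId(const std::string& s)
    {
        if (s.empty() || !std::isdigit(static_cast<unsigned char>(s[0])))
            return std::nullopt; // disengaged optional replaces the flag
        return std::stoi(s);     // engaged optional carries the value
    }

    int main()
    {
        if (auto id = TryParseDeviceId("3"))
            std::cout << "device " << *id << "\n"; // prints "device 3"
    }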
bindings/python/doc/conf.py (3 lines): - line 48: # TODO nitpick_ignore - line 53: version = cntk.__version__ # TODO consider shortening - line 86: # TODO temporary Source/ComputationNetworkLib/RecurrentNodes.h (3 lines): - line 25: // TODO: 'direction' is really too general. signOfTimeOffset? - line 90: shared_ptr> m_zeroMatrix; // constant [1]-dimensional 0 used for backprop --TODO: could use a static map[deviceId] - line 105: // TODO: Can this just be a typedef? bindings/python/cntk/io/__init__.py (3 lines): - line 980: # TODO: this should be a private class; use StreamDef instead - line 1061: # TODO: we should always use 'shape' unless it is always rank-1 or a single rank's dimension - line 1062: # TODO: dim should be inferred from the file, at least for dense bindings/python/cntk/layers/higher_order_layers.py (3 lines): - line 18: # TODO: should we have a parameter to specify the arity of the input? - line 86: # TODO: Is this confusing w.r.t. tuple which is parallel and list which is sequential? - line 203: # TODO: consider potential name clash; users might want to call their functions the same. Source/Common/Include/RandomOrdering.h (3 lines): - line 31: // TODO: Switching to Boost would eliminate this problem. - line 41: // TODO: Switching to Boost would eliminate this problem. - line 125: const size_t tbegin = max((size_t) t, randomizationrange / 2) - randomizationrange / 2; // range of window --TODO: use bounds() function above Source/Readers/CNTKBinaryReader/CNTKBinaryReader.cpp (3 lines): - line 21: // TODO: This class should go away eventually. - line 22: // TODO: The composition of packer + randomizer + different deserializers in a generic manner is done in the CompositeDataReader. - line 23: // TODO: Currently preserving this for backward compatibility with current configs. Source/ActionsLib/TrainActions.cpp (3 lines): - line 56: // TODO: CNTK config added "traceLevel = 0" to 'config'. In BS, we cannot do that (IConfigRecord is immutable). Solution: Just say "traceLevel = 0" in the BS macros for readers. - line 73: objConfig.Insert("traceLevel", config(L"traceLevel", "0")); // TODO: fix this by adding it to all config blocks. Easy to fix in BS as 'config with [ traceLevel = 0 ]'. - line 171: // TODO: remove this Source/SGDLib/SimpleOutputWriter.h (3 lines): - line 114: // TODO: What should the data size be? - line 148: // TODO: Remove code dup with above function by creating a fake Writer object and then calling the other function. - line 204: std::map> outputStreams; // TODO: why does unique_ptr not work here? Complains about non-existent default_delete() Source/Common/Include/ProgressTracing.h (3 lines): - line 25: // TODO: make this proper C++ functions with variadic templates and a name that reflects their difference to fprintf(stderr) which already implies printing to log - line 55: bool m_timestampFlag; // TODO: What does this do? TODO: camelCase - line 80: // TODO: timestampFlag or timestampingFlag? (Or timeStampFlag?) Source/Readers/ImageReader/ImageReader.cpp (3 lines): - line 22: // TODO: This class should go away eventually. - line 23: // TODO: The composition of packer + randomizer + different deserializers in a generic manner is done in the CompositeDataReader. - line 24: // TODO: Currently preserving this for backward compatibility with current configs. Source/Readers/CompositeDataReader/CompositeDataReader.cpp (3 lines): - line 161: // TODO: decrease randomization window if m_deserializers.size() > 1 ? 
- line 183: // TODO: Output stream descriptions - this should come from the network so that we can check - line 252: // TODO: Should go away in the future. Framing can be done on top of deserializers. bindings/python/setup.py (2 lines): - line 22: # TODO should handle swig path specified via build_ext --swig-path - line 216: extra_link_args = [] # TODO: LINKER_DEBUG_ARG is not passed in to avoid compilation error Source/CNTKv2LibraryDll/BackCompat.cpp (2 lines): - line 112: // TODO: Currently only default dynamic axis is supported - line 119: // TODO: Allow creating inputs without a dynamic axis Source/Math/cudalattice.cpp (2 lines): - line 22: extern void operator||(cudaError_t rc, const char *msg); // TODO: imported from cudamatrix.cpp --better move to cudalib.h - line 25: // TODO: This really should not be in cudalattice, since it is more general; we need a cudavector.cpp/h Source/Readers/HTKMLFReader/minibatchiterator.h (2 lines): - line 188: // TODO not nice, but don't know how to access these frames otherwise - line 288: // User is allowed to manipulate the frames... for now--TODO: move silence filtering here as well Source/Readers/CNTKBinaryReader/BinaryChunkDeserializer.cpp (2 lines): - line 23: // TODO: compressed_sparse_csc = 2, // indices are encoded as var-ints - line 181: // TODO: use a pool of buffers instead of allocating a new one, each time a chunk is read. Source/CNTKv2LibraryDll/proto/onnx/RNNHelper.cpp (2 lines): - line 490: // TODO: do not transpose after RNN ops so we have one code path here. - line 624: // TODO: get H1 Source/Readers/LUSequenceReader/LUSequenceReader.h (2 lines): - line 307: } // TODO: clean this up - line 378: size_t mMaxSentenceLength; // max over mSentenceLength[] --TODO: why not compute on the fly? Source/Readers/ImageReader/ImageDeserializerBase.cpp (2 lines): - line 74: // TODO: multiview should be done on the level of randomizer/transformers - it is responsibility of the - line 75: // TODO: randomizer to collect how many copies each transform needs and request same sequence several times. bindings/python/cntk/losses/__init__.py (2 lines): - line 49: # TODO: Per discussion with sayanp, the underlying C++ code is not fully functional, so this - line 398: # TODO: fix the bug in backprop for sparse, and use sparse embedding to accelerate Source/CNTKv2LibraryDll/NDMask.cpp (2 lines): - line 43: // TODO: Implement batching of masking operation for masks residing on GPUs to avoid making - line 92: // TODO: This could actually be strided? Source/Readers/HTKDeserializers/HTKMLFReader.h (2 lines): - line 15: // TODO: Should be deprecated. Composite reader should be used instead. - line 38: // TODO: Should be moved outside of the reader. Source/Math/GPUDataTransferer.cpp (2 lines): - line 19: // TODO: get from an env variable - line 75: // TODO: Check for error code and throw if !std::uncaught_exception() Source/Readers/ReaderLib/BlockRandomizer.h (2 lines): - line 35: // TODO: The behavior can be simplified by only randomizing sequences forward. - line 145: // TODO generalize those for ReaderLib / Reader / CNTK Source/EvalDll/EvalReader.h (2 lines): - line 66: m_switchFrame[0] = m_mbSize + 8888; // TODO: WTF?? - line 195: // pMBLayout->Set(0, m_switchFrame[0] - 1, MinibatchPackingFlags::SequenceEnd); // TODO: can't we use Set()?
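A few entries back, RandomOrdering.h line 125 shows why its TODO asks for the bounds() helper: max((size_t) t, randomizationrange / 2) - randomizationrange / 2 is an underflow-safe way of computing t - randomizationrange / 2 clamped at zero, since plain subtraction would wrap around for unsigned t. A sketch with a hypothetical named helper that states the intent directly:

    #include <algorithm>
    #include <cassert>
    #include <cstddef>

    // Underflow-safe "t - halfRange, clamped at 0" for unsigned arithmetic;
    // plain `t - halfRange` would wrap around whenever t < halfRange.
    // Hypothetical helper name, sketching the cleanup the TODO asks for.
    inline std::size_t ClampedWindowBegin(std::size_t t, std::size_t halfRange)
    {
        return std::max(t, halfRange) - halfRange;
    }

    int main()
    {
        assert(ClampedWindowBegin(100, 25) == 75); // ordinary case: t - halfRange
        assert(ClampedWindowBegin(10, 25) == 0);   // clamped, not wrapped
    }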
Source/Math/fpgeneric.h (2 lines):
- line 11: // NV_TODO: optimize speed -- pass things needed in, optimize kernel speed, add half2
- line 12: // NV_TODO: investigate cub support for half

Source/ComputationNetworkLib/ComputationNodeScripting.cpp (2 lines):
- line 10: #include "ComputationNetworkBuilder.h" // TODO: We should only pull in NewComputationNodeFromConfig(). Nodes should not know about the network at large.
- line 51: // TODO: Think through what tags mean. Do we allow user-named tags? Is it a set or a single string? If a set, then how to compare?

Source/Readers/CNTKBinaryReader/Exports.cpp (2 lines):
- line 21: // TODO: Memory provider should be injected by SGD.
- line 44: // TODO: do we want to support non-primary binary deserializers?

Source/Readers/HTKDeserializers/HTKDeserializer.h (2 lines):
- line 26: // TODO: Should be removed when legacy config goes away; expects configuration in a legacy mode.
- line 83: // TODO: This should be moved to the packers when deserializers work in sequence mode only.

bindings/python/cntk/train/trainer.py (2 lines):
- line 60: # TODO sanitizing should be removed once Swig's typemaps are in place
- line 77: # TODO: bring this back once the design has been settled

bindings/python/cntk/core.py (2 lines):
- line 298: # TODO: Add direct conversion, since creating an intermediate array might be slow
- line 538: elif not isinstance(batch, list): # TODO: allow general iterables

bindings/python/cntk/logging/progress_print.py (2 lines):
- line 28: # TODO: Let's switch to import logging in the future instead of print. [ebarsoum]
- line 102: # TODO: this is for internal purposes, so find a better way

bindings/common/CNTKManagedCommon.i (2 lines):
- line 656: // TODO: make the Java binding deal with double*, float* and int* correctly.
- line 748: // TODO: make the following methods also private in Java, after CreateBatch/CreateSequence/... methods are implemented there.

bindings/python/cntk/layers/models/attention.py (2 lines):
- line 63: # TODO: pull this apart so that we can compute the encoder window only once and apply it to multiple decoders
- line 75: u_masked = u + (h_enc_valid - 1) * 50 # logzero-out the unused elements for the softmax denominator TODO: use a less arbitrary number than 50 [sketch below]

Source/Readers/HTKDeserializers/Exports.cpp (2 lines):
- line 43: // TODO: Must be removed when SGD is moved to an untyped matrix.
- line 59: // TODO: Not safe from the ABI perspective. Will be uglified to make the interface ABI.

Source/Readers/CompositeDataReader/CompositeDataReader.h (2 lines):
- line 50: // TODO: Implement proper corpus descriptor.
- line 51: // TODO: Change this interface when SGD is changed.

Source/Readers/ReaderLib/MemoryProvider.h (2 lines):
- line 14: // TODO: Should be injected by CNTK into the reader (will be a member of Matrix class).
- line 25: // TODO: add Resize function.

Source/Readers/ReaderLib/ChunkRandomizer.h (2 lines):
- line 69: // TODO: Currently, we have to preserve the same behavior for randomization in order to make all tests pass.
- line 70: // TODO: Randomization can be made simpler if we randomize only forwards.

Source/CNTKv2LibraryDll/NDArrayView.cpp (2 lines):
- line 718: // TODO: This could actually be strided?
- line 734: // TODO: This could actually be strided?
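
The attention.py entry above (line 75) uses a standard masking trick: the logits of padded positions get a large negative offset so they contribute (almost) nothing to the softmax denominator; the TODO objects to the arbitrary constant 50. A minimal C++ sketch of the same idea using -infinity instead of an arbitrary offset; this illustrates the technique only, it is not the CNTK implementation, and it assumes at least one valid position:

    #include <algorithm>
    #include <cmath>
    #include <limits>
    #include <vector>

    // Softmax over `logits`, with positions where valid[i] == false forced to
    // probability 0 by setting their logit to -infinity before normalizing.
    std::vector<double> MaskedSoftmax(std::vector<double> logits,
                                      const std::vector<bool>& valid)
    {
        const double negInf = -std::numeric_limits<double>::infinity();
        double maxLogit = negInf;
        for (size_t i = 0; i < logits.size(); ++i)
        {
            if (!valid[i])
                logits[i] = negInf;    // exp(-inf) == 0, so it drops out of the sum
            maxLogit = std::max(maxLogit, logits[i]);
        }
        double denom = 0;
        for (double& v : logits)
            denom += (v = std::exp(v - maxLogit)); // shift by max for stability
        for (double& v : logits)
            v /= denom;
        return logits;
    }

With a true -infinity the masked terms are exactly zero, so no "less arbitrary number than 50" has to be chosen at all; in practice frameworks often still prefer a large finite constant to keep inf/NaN out of the gradient path.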
bindings/python/cntk/learners/__init__.py (2 lines):
- line 539: # TODO figure out how to pass infty to C++ in a portable way
- line 1212: # TODO: add additional options and learning context to the parameters of the update_func so that the update function

Source/ComputationNetworkLib/ConvolutionalNodes.h (2 lines):
- line 498: // TODO: the check for NeedsDynamicValidation() is a temporary resolution and needs to be properly handled when we look at support for free-dimension convolution inputs.
- line 1283: // TODO: need to add support for other pooling types, for example,

Source/Common/Include/pplhelpers.h (2 lines):
- line 72: // ... TODO: Currently, 'cores' does not limit the number of threads in parallel_for() (not so critical, fix later or never)
- line 90: // ... TODO: does the above actually do anything/significant? nfwd != targetstep?

bindings/python/cntk/contrib/deeprl/agent/agent.py (2 lines):
- line 194: # TODO: consider using cntk.core.Value.one_hot here.
- line 215: # TODO: allow float64 dtype.

Source/1BitSGD/QuantizedDistributedCommunicator.h (2 lines):
- line 102: // TODO: Use using and virtual inheritance after switching to VS2015.
- line 510: // TODO: Should we use MPI_Bcast instead for better performance?

bindings/python/cntk/SwigDeserializer.h (2 lines):
- line 251: // TODO: profile, probably need to have some form of
- line 423: // TODO: profile, probably need to have some form of

Source/Readers/LUSequenceReader/DataWriterLocal.cpp (2 lines):
- line 7: // TODO: Unify with shared DataWriter.cpp.
- line 38: // TODO: don't we need to destroy ourselves?

Source/Math/cudalatticeops.cu.h (2 lines):
- line 105: // cudaarrayref logLLsarray; // TODO: pass this in, of course
- line 252: // TODO: is this really efficient? One thread per value?

Source/Math/cudabasetypes.h (2 lines):
- line 17: #define ON_CUDA 0 // TODO: this does not work for some combination--fix this
- line 31: typedef size_t cuda_size_t; // TODO: verify if this is consistent across CPU/CUDA, or use uint32 or so

Source/Common/Include/Platform.h (2 lines):
- line 192: } // TODO: check if correct
- line 196: } // TODO: correct for size_t?

Source/SGDLib/Criterion.h (2 lines):
- line 13: #include // for isnan() and numeric_limits --TODO: is that the right header?
- line 142: // TODO: Verify that node->GetSampleLayout().GetNumElements() == 1. Require explicit summation to declare intent that this is a criterion.

Source/Common/MPIWrapper.cpp (2 lines):
- line 302: // TODO: or does that only signal an issue, and we should still terminate ourselves?
- line 310: // TODO: this is not threadsafe.

Source/Math/CuDnnBatchNormalization.cu (2 lines):
- line 50: // TODO batchSize == 1
- line 156: // TODO: Should not throw if std::uncaught_exception()

Scripts/ctf2bin.py (2 lines):
- line 33: # TODO: use varint encoding for sparse indices,
- line 185: # TODO: add a hash of the chunk

Source/Common/BestGpu.cpp (2 lines):
- line 160: // TODO: Do we need to hold this pointer at all? We will only query it once. Or is it used to hold a lock on a GPU?
- line 292: // TODO: This is duplicated in GPUMatrix.cu

Source/Readers/Kaldi2Reader/rollingwindowsource.h (2 lines):
- line 286: // TODO:
- line 429: if (numframes != classids[j].size()) // TODO: remove this once we are confident

Source/Readers/Kaldi2Reader/DataWriter.cpp (2 lines):
- line 7: // TODO: This is similar but not identical to Common/DataWriter.cpp. Why is this DataWriter different? Can it be reconciled?
- line 21: // TODO: move these to Exports.cpp

Source/Readers/ReaderLib/Reader.h (2 lines):
- line 47: // TODO: Should be deprecated.
- line 102: // TODO: should be deprecated; SetConfiguration should be used instead.

Source/CNTKv2LibraryDll/proto/onnx/patch/onnxruntime/core/platform/windows/env.cc (2 lines):
- line 156: // TODO: make sure O_TRUNC is added.
- line 173: // TODO: make sure O_TRUNC is added.

bindings/python/cntk/layers/typing.py (2 lines):
- line 182: tensor = Tensor[-2] # TODO: find the correct symbol for the sentinel value
- line 198: # TODO: accept Python's typing.Sequence instead; then import layers.typing by default in layers.__init__.py

Source/Readers/ReaderLib/TruncatedBpttPacker.cpp (2 lines):
- line 136: // TODO: add support for sparse.
- line 343: // TODO: make type casts members of the SparseSequenceData

Source/Readers/Kaldi2Reader/minibatchiterator.h (2 lines):
- line 169: // TODO not nice, but don't know how to access these frames otherwise
- line 267: // User is allowed to manipulate the frames... for now--TODO: move silence filtering here as well

Source/ComputationNetworkLib/EvaluationNodes.h (2 lines):
- line 76: // TODO: Make topK a constructor parameter
- line 901: // TODO: member variables go to the end

Source/Readers/HTKMLFReader/DataWriterLocal.cpp (2 lines):
- line 7: // TODO: This is similar but not identical to Common/DataWriter.cpp. Why is this DataWriter different? Can it be reconciled?
- line 40: // TODO: do we need to destroy ourselves as well?

Source/Math/GPUTensor.cu (2 lines):
- line 648: // TODO: use volatile* and then we can skip the __syncthreads() for the last 32 values. See Amit's allreduce() function implementation in MatrixQuantizer_kernel.cu.
- line 860: static shared_ptr reductionBuffersCache[32]; // cache of objects --TODO: Do we have a #define for the max somewhere? Then also use it in CPUMatrix.cu GetOnesTensor()

Source/Readers/ReaderLib/TruncatedBpttPacker.h (2 lines):
- line 19: // TODO: Currently supports only packing of streams with sequences of equal length.
- line 70: // TODO: currently we assume that the layout is the same between different streams; this will change.

Source/Readers/HTKMLFReader/htkfeatio.h (2 lines):
- line 334: // TODO make this nicer
- line 429: // TODO \r should be handled elsewhere; refine this

Source/CNTKv2LibraryDll/MinibatchSource.cpp (2 lines):
- line 330: // TODO: Remove call to StartEpoch - this API is legacy.
- line 606: // TODO: This should be done in the same manner for the CNTK exe as well.

bindings/python/cntk/contrib/deeprl/agent/policy_gradient.py (2 lines):
- line 232: # TODO: allow user to specify learner through config file.
- line 327: # TODO: consider using cntk.ops.one_hot instead of _index_to_vector

Source/Readers/HTKDeserializers/ConfigHelper.cpp (2 lines):
- line 63: // TODO: let's deprecate this eventually and just use "type"...
- line 325: // TODO: possibly change to class File; we should be able to read data from pipelines, e.g.

Source/Readers/ReaderLib/StringToIdMap.h (2 lines):
- line 20: // TODO: Move this class to Basics.h when it is required by more than one reader.
- line 85: // TODO: Move NonCopyable as a separate class to Basics.h

Source/Common/Include/StringUtil.h (2 lines):
- line 44: // TODO: Should switch to boost; boost::iequal should be used instead.
- line 45: // TODO: we already have EqualCI() in Basics.h which does the same thing.

Source/CNTKv2LibraryDll/Variable.cpp (2 lines):
- line 489: // TODO: add a dictionary value constructor with an rvalue parameter.
- line 549: // TODO: this copying here is redundant; the value should be moved from the dictionary to the variable.

Source/CNTK/BrainScript/BrainScriptParser.h (2 lines):
- line 103: wstring op; // operation, encoded as a string; 'symbol' for punctuation and keywords, otherwise used in constructors below ...TODO: use constexpr
- line 143: // TODO: These rvalue references are no longer adding value; change to const<>&

Source/ActionsLib/SynchronousExecutionEngine.h (2 lines):
- line 291: // TODO: It seems that this is also applied to other types of nodes, so the name of this function is wrong.
- line 325: // TODO JC Refactor eligible methods and members into abstract base class.

Source/Math/ColumnQuantizer.h (2 lines):
- line 208: // TODO: further opportunity for speed-up: use 'mean' from last round for 1-bit and stddev calc
- line 320: ElemType stddevs = 4.0f; // TODO: make this a parameter

Source/Math/TensorOps.h (2 lines):
- line 97: short t = *(short*)&v & 0x7FFF; // TODO: Check this! [sketch below]
- line 212: return half(powf((float)v, (float)e)); // TODO: Improve efficiency?

Source/Math/half.hpp (2 lines):
- line 7: // TODO: investigate performance of implementation, function signature and efficiency
- line 169: // overload binary operators between 'half' and built-in types. TODO: This should be handled in a better way

Source/Readers/ReaderLib/NoRandomizer.h (2 lines):
- line 17: // TODO: currently this code was moved from the old block randomizer.
- line 18: // TODO: The class will be further refactored and a common base will be extracted with BlockRandomizer.

Source/ComputationNetworkLib/UserDefinedV2FunctionNode.h (2 lines):
- line 230: // TODO: We unpack the same output gradients each time this method is called for a different input.
- line 263: // TODO: We should directly pass the actual input gradient tensor to the Backward method

Source/1BitSGD/AllReduceDistGradAggregator.h (2 lines):
- line 498: // TODO: Should we use MPI_Bcast instead for better performance?
- line 511: // TODO: Should we use MPI_Bcast instead for better performance?

Source/Readers/Kaldi2Reader/pplhelpers.h (2 lines):
- line 72: // ... TODO: Currently, 'cores' does not limit the number of threads in parallel_for() (not so critical, fix later or never)
- line 90: // ... TODO: does the above actually do anything/significant? nfwd != targetstep?

Source/ComputationNetworkLib/PreComputeNodes.h (2 lines):
- line 258: // TODO: share stuff with MeanNode
- line 378: // TODO: Deprecate like PerDimMeanVarNormalizationNode as soon as we have a test case. Or just delete it.

bindings/python/onnx_cntk/backend.py (2 lines):
- line 66: # TODO: make this work for the multiple-output case.
- line 67: # TODO: support more types.

Source/Readers/HTKDeserializers/MLFBinaryDeserializer.cpp (2 lines):
- line 164: // TODO: Possibly set m_valid to false, but currently preserving the old behavior.
- line 185: LogicError("TODO: implement phoneBoundaries setting in Binary MLF deserializer.");

Source/Readers/HTKDeserializers/HTKFeaturesIO.h (2 lines):
- line 6: // TODO: Currently borrowed from the old reader, should be refactored.
- line 168: // TODO make this nicer

Source/Math/stdafx.cpp (1 line):
- line 12: // TODO: reference any additional headers you need in STDAFX.H

Source/Common/Globals.cpp (1 line):
- line 13: // TODO: get rid of this source file once static initializers in methods are thread-safe (VS 2015)

Source/CNTKv2LibraryDll/Serialization.h (1 line):
- line 31: const std::wstring minibatchCountKey = L"minibatchCount"; // TODO: Python-style spelling

Source/Common/Include/BestGpu.h (1 line):
- line 86: // TODO: find a way to use CPUDEVICE without a huge include overhead; OK so far since CPUONLY mode is sorta special...

Source/Readers/CNTKTextFormatReader/stdafx.cpp (1 line):
- line 12: // TODO: reference any additional headers you need in STDAFX.H

Source/Readers/CompositeDataReader/stdafx.h (1 line):
- line 23: // TODO: reference additional headers your program requires here

Source/SGDLib/stdafx.h (1 line):
- line 21: // TODO: reference additional headers your program requires here

Source/Readers/Kaldi2Reader/htkfeatio.h (1 line):
- line 264: // TODO make this nicer

Source/Readers/HTKDeserializers/LatticeDeserializer.cpp (1 line):
- line 70: // TODO: switch to char when possible.

Source/Readers/BinaryReader/BinaryFile.cpp (1 line):
- line 322: // TODO: this view change only accommodates this request

bindings/python/cntk/debugging/__init__.py (1 line):
- line 62: if axis.name == "UnknownAxes": # TODO: what is the correct way of testing this?

Source/ComputationNetworkLib/ComputationEnvironment.h (1 line):
- line 30: // TODO: change the data member names back to m_ syntax, or get team consensus to not do that

Source/Readers/ImageReader/stdafx.cpp (1 line):
- line 12: // TODO: reference any additional headers you need in STDAFX.H

Source/Readers/ReaderLib/ReaderUtil.cpp (1 line):
- line 15: if (!_wcsicmp(randomizeString.c_str(), L"none")) // TODO: don't support case-insensitive option strings in the new reader

Source/Readers/CNTKTextFormatReader/CNTKTextFormatReader.h (1 line):
- line 13: // TODO: Should be deprecated; use the composite reader instead.

Source/Readers/SparsePCReader/stdafx.cpp (1 line):
- line 7: // TODO: reference any additional headers you need in STDAFX.H

Source/Readers/BinaryReader/BinaryReader.h (1 line):
- line 476: typedef std::string LabelType; // TODO: are these supposed to be the same as the DataReader's?

Source/Math/GPUDataTransferer.h (1 line):
- line 124: // TODO: this needs to be refactored to get rid of all statics

bindings/python/cntk/contrib/deeprl/agent/shared/qlearning_parameters.py (1 line):
- line 22: # TODO: validate parameter values.

Source/1BitSGD/V2AllReduceDistGradAggregator.h (1 line):
- line 237: // TODO: Should be async

Source/Readers/HTKDeserializers/HTKChunkDescription.h (1 line):
- line 20: // TODO: We should consider splitting data load from the description in future versions.

bindings/python/cntk/contrib/deeprl/agent/qlearning.py (1 line):
- line 105: # TODO: allow user to specify learner through config file.

Source/Readers/LibSVMBinaryReader/stdafx.cpp (1 line):
- line 7: // TODO: reference any additional headers you need in STDAFX.H

Source/ActionsLib/NDLNetworkBuilder.h (1 line):
- line 325: // TODO JC Refactor eligible methods and members into abstract base class.
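
The TensorOps.h entry above (line 97) masks off the sign bit of a half with `& 0x7FFF`. For reference, a self-contained sketch of the IEEE 754 binary16 bit tests that this mask enables; it shows the standard encoding, not CNTK's actual half type:

    #include <cstdint>
    #include <cstring>

    // IEEE 754 binary16: 1 sign bit, 5 exponent bits (0x7C00), 10 mantissa bits (0x03FF).
    inline bool IsHalfNaN(uint16_t bits)
    {
        uint16_t magnitude = bits & 0x7FFF; // drop the sign bit, as the quoted mask does
        return magnitude > 0x7C00;          // exponent all ones AND nonzero mantissa => NaN
    }

    inline bool IsHalfInf(uint16_t bits)
    {
        return (bits & 0x7FFF) == 0x7C00;   // exponent all ones, mantissa zero => +/-inf
    }

    // Portable way to get at the bits; the quoted *(short*)&v cast violates
    // strict aliasing, which may be part of what the "Check this!" is about.
    inline uint16_t HalfBits(const void* halfValue)
    {
        uint16_t bits;
        std::memcpy(&bits, halfValue, sizeof(bits));
        return bits;
    }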
Source/CNTKv2LibraryDll/stdafx.h (1 line):
- line 17: // TODO: reference additional headers your program requires here

Source/CNTKv2LibraryDll/API/Internals/PrimitiveFunction.h (1 line):
- line 687: // TODO: Reconcile this with the ComputationNode::Validate functionality in core CNTK to avoid duplication of inference logic

Source/Readers/ReaderLib/Index.cpp (1 line):
- line 55: // TODO: the sum of sizes does not account for a possible gap before the sequence offset.

Source/ComputationNetworkLib/stdafx.h (1 line):
- line 19: // TODO: reference additional headers your program requires here

Source/Readers/ReaderLib/DataDeserializer.h (1 line):
- line 11: // TODO: CNTKLibrary.h should be cleaned up to allow header-only dependencies.

bindings/python/cntk/debugging/debug.py (1 line):
- line 199: # TODO:

Source/Common/Include/ASGDHelper.h (1 line):
- line 17: // TODO: We can remove these options once we can adjust the learning rate at the minibatch level

bindings/python/cntk/internal/sanitize.py (1 line):
- line 122: # TODO: check whether Values can be ingested directly

Source/Readers/BinaryReader/stdafx.cpp (1 line):
- line 7: // TODO: reference any additional headers you need in STDAFX.H

Source/Readers/CNTKTextFormatReader/TextParser.h (1 line):
- line 22: // TODO: more details when tracing warnings

Source/Readers/CompositeDataReader/stdafx.cpp (1 line):
- line 11: // TODO: reference any additional headers you need in STDAFX.H

bindings/java/Swig/post-build.cmd (1 line):
- line 22: rem: TODO: add a check whether javac/jar exist.

Source/DelayLoadedExtensionsDll/ImageWriter.h (1 line):
- line 25: // TODO: Fix CNTKLibrary.h and CNTKLibraryInternals.h for CNTK_HEADERONLY_DEFINITIONS.

Source/Readers/CNTKBinaryReader/BinaryDataDeserializer.h (1 line):
- line 73: // TODO:

Source/CNTKv2LibraryDll/DistributedCommunicator.h (1 line):
- line 88: // TODO: these two are always parallel; merge them together?

Source/Common/ExceptionWithCallStack.cpp (1 line):
- line 166: // TODO: WE SHOULD REMOVE THIS HACK ASAP.

Source/Common/Include/fileutil.h (1 line):
- line 886: // ... TODO: handle long NT paths [sketch below]

Source/DelayLoadedExtensionsDll/ImageWriter.cpp (1 line):
- line 16: // time so rpath will apply (Linux). // TODO find a better way

Source/CNTKv2LibraryDll/proto/CNTK.proto (1 line):
- line 64: // TODO: bool read_only = 8;

CNTK.Cpp.props (1 line):
- line 147:

Source/CNTKv2LibraryDll/Common.cpp (1 line):
- line 732: // TODO: alternatively, print a warning and return false.

Source/Readers/Kaldi2Reader/stdafx.cpp (1 line):
- line 12: // TODO: reference any additional headers you need in STDAFX.H

Source/Common/Include/ssefloat4.h (1 line):
- line 38: // TODO: In the future, we should provide a NEON-based implementation instead.
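
The fileutil.h entry above (line 886) refers to Windows extended-length paths: the wide Win32 file APIs reject paths at or beyond MAX_PATH (260 characters) unless they carry the \\?\ prefix. A hedged sketch of what such handling might look like; ToExtendedLengthPath is a hypothetical helper, not the actual CNTK code:

    #include <string>

    // On Windows, paths at or beyond MAX_PATH (260) need the extended-length
    // prefix \\?\ to be accepted by the wide file APIs. Hypothetical helper;
    // only valid for absolute paths without . or .. components, since the
    // prefix also disables path normalization.
    inline std::wstring ToExtendedLengthPath(const std::wstring& absolutePath)
    {
        const size_t maxPath = 260; // MAX_PATH
        if (absolutePath.size() >= maxPath &&
            absolutePath.compare(0, 4, L"\\\\?\\") != 0)
            return L"\\\\?\\" + absolutePath;
        return absolutePath;
    }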
Source/Readers/LMSequenceReader/stdafx.cpp (1 line):
- line 7: // TODO: reference any additional headers you need in STDAFX.H

Source/Readers/ReaderLib/FileWrapper.h (1 line):
- line 9: #pragma warning(disable : 4996) // ^^ this does not seem to work--TODO: make it work [sketch below]

Source/Readers/UCIFastReader/UCIParser.cpp (1 line):
- line 378: void UCIParser::ParseInit(LPCWSTR /*TODO: change to C++ type*/ fileName, size_t startFeatures, size_t dimFeatures, size_t startLabels, size_t dimLabels, size_t bufferSize, size_t startPosition)

Source/Readers/HTKMLFReader/stdafx.cpp (1 line):
- line 12: // TODO: reference any additional headers you need in STDAFX.H

Source/CNTKv2LibraryDll/API/HalfConverter.hpp (1 line):
- line 8: // TODO: use f16c instructions if available

Source/CNTKv2LibraryDll/proto/onnx/patch/onnxruntime/core/session/onnxruntime_c_api.cc (1 line):
- line 466: // TODO: test if it's a string tensor

Source/Math/QuantizedOperations.h (1 line):
- line 66: // TODO: replace with an efficient version, e.g. IPG, block multiplier, Eigen, gemmlowp, etc.

bindings/python/cntk/internal/utils.py (1 line):
- line 68: # TODO figure out a better/faster way.

Source/Readers/ReaderLib/SequenceEnumerator.h (1 line):
- line 46: // TODO: should be deprecated.

Source/ActionsLib/Actions.h (1 line):
- line 24: using namespace Microsoft::MSR::CNTK; // TODO: we should not have this in a header

Source/ComputationNetworkLib/TrainingNodes.cpp (1 line):
- line 188: // TODO: Should we prepare the CSC data directly on the CPU and move it in one go?

Source/Readers/ImageReader/Base64ImageDeserializer.cpp (1 line):
- line 24: // TODO: Could probably be a memory-mapped region.

Source/Readers/Kaldi2Reader/readaheadsource.h (1 line):
- line 272: } // TODO: no, use our own time measurement

Source/Readers/HTKDeserializers/stdafx.cpp (1 line):
- line 12: // TODO: reference any additional headers you need in STDAFX.H

Source/ComputationNetworkLib/LinearAlgebraNodes.cpp (1 line):
- line 14: // TODO: can this be static?

bindings/csharp/CNTKLibraryManagedDll/ShimApiClasses/StreamConfigurationShim.cs (1 line):
- line 13: /// TODO: do we need to handle special dimension values?

Source/CNTKv2LibraryDll/tensorboard/TensorBoardUtils.cpp (1 line):
- line 111: // TODO: set attrs["value"] for Constant - how to get the value?

Source/Math/Helpers.h (1 line):
- line 6: // TODO: the file's name is too general to be included from outside; MathHelpers.h?

Scripts/install/windows/ps/Modules/Disk/Disk.psm1 (1 line):
- line 47: # TODO: this is not concurrency-safe. Another job could use a directory we are trying to remove ...

Source/Readers/CNTKBinaryReader/BinaryChunkDeserializer.h (1 line):
- line 86: // TODO: more details when tracing warnings

Source/Readers/SparsePCReader/SparsePCReader.cpp (1 line):
- line 306: // TODO: They are not really temporally ordered, so a better way would be to use tensors, once that is ready.
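
On the FileWrapper.h entry above (line 9): with MSVC, `#pragma warning(disable : 4996)` only suppresses the warning if it is still in effect at the point where the compiler emits it, and a later header or `#pragma warning(pop)` can silently re-enable it. A sketch of the conventionally scoped pattern; whether this is the actual reason the quoted pragma "does not seem to work" is a guess:

    #ifdef _MSC_VER
    #pragma warning(push)
    #pragma warning(disable : 4996) // e.g. 'fopen': This function may be unsafe
    #endif

    #include <cstdio>

    inline FILE* OpenForRead(const char* path)
    {
        return fopen(path, "rb"); // C4996 is silenced between push and pop
    }

    #ifdef _MSC_VER
    #pragma warning(pop)
    #endif

For the CRT functions specifically, defining _CRT_SECURE_NO_WARNINGS before the first CRT header is included is the other common fix.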
Source/ComputationNetworkLib/stdafx.cpp (1 line):
- line 12: // TODO: reference any additional headers you need in STDAFX.H

Source/Math/ConvolutionEngine.cpp (1 line):
- line 959: // TODO: test code for linking with mkldnn.dll; will extend to support dilated convolution with MKL-DNN later

Source/Readers/Kaldi2Reader/HTKMLFWriter.cpp (1 line):
- line 93: // TODO: The format specifier should probably be "%ls" here, but I'm not making that change as part of

Source/Readers/ReaderLib/SequencePacker.cpp (1 line):
- line 209: // TODO: make type casts members of the SparseSequenceData

Source/Readers/ImageReader/ImageTransformers.h (1 line):
- line 37: // TODO: This is potentially an expensive operation. Need to do some logging.

Source/Readers/DSSMReader/DSSMReader.cpp (1 line):
- line 326: Matrix& labels = matrices.GetInputMatrix(m_labelsName); // will change this part later. TODO: How?

Source/Readers/ReaderLib/IndexBuilder.cpp (1 line):
- line 116: // TODO: add a TryRename that does not throw. [sketch below]

Source/SGDLib/stdafx.cpp (1 line):
- line 12: // TODO: reference any additional headers you need in STDAFX.H

Source/Readers/BinaryReader/BinaryWriter.cpp (1 line):
- line 164: // TODO: sanity check and use records as a clue of how big to make it

Source/Readers/ReaderLib/CudaMemoryProvider.h (1 line):
- line 15: /// TODO: Memory provider should reside on the matrix. It is the responsibility of the network

Source/Extensibility/EvalWrapper/EvalExtendedWrapper.cpp (1 line):
- line 119: // TODO: Should it have a read-only StorageType property?

Source/Math/MatrixQuantizerGPU.h (1 line):
- line 3: #include "QuantizedMatrix.h" // TODO: strangely, this must be included first, although it is the first thing MatrixQuantizer.h includes. Without it, nvcc fails.

Source/Math/ConvolveGeometry.h (1 line):
- line 24: // TODO: rename to ConvolutionGeometry

Source/SGDLib/SimpleDistGradAggregator.h (1 line):
- line 348: // TODO: we need a CopyGPUToCPUSync

Source/Readers/DSSMReader/stdafx.cpp (1 line):
- line 7: // TODO: reference any additional headers you need in STDAFX.H

bindings/python/cntk/logging/graph.py (1 line):
- line 282: # TODO: Would be cool if the user could pass a dictionary with overrides. But maybe for a later version.

Source/Readers/ReaderLib/NoRandomizer.cpp (1 line):
- line 233: // TODO: This will be changed when we move transformers under the (no-)randomizer; we should not deal with multithreading here.

Source/Common/Include/latticesource.h (1 line):
- line 49: #ifndef NONUMLATTICEMMI // TODO: set NUM lattice to null so as to save memory

Source/Readers/HTKDeserializers/MLFBinaryDeserializer.h (1 line):
- line 22: // TODO: Should be removed when all readers go away; expects configuration in a legacy mode.

Source/Readers/ImageReader/ImageDataDeserializer.h (1 line):
- line 28: // TODO: This constructor should be deprecated in the future. Compositional config should be used instead.

Source/Readers/CNTKBinaryReader/stdafx.cpp (1 line):
- line 12: // TODO: reference any additional headers you need in STDAFX.H

Source/Readers/CNTKBinaryReader/stdafx.h (1 line):
- line 21: // TODO: reference additional headers your program requires here

Source/Readers/ImageReader/ImageDataDeserializer.cpp (1 line):
- line 69: // TODO: Should be removed at some point.

Source/Readers/ImageReader/stdafx.h (1 line):
- line 21: // TODO: reference additional headers your program requires here

Source/CNTK/tests.cpp (1 line):
- line 271: if (configParam.Exists("learningRateMultiplier")) // TODO: should this be a test for 'true' rather than Exists()?
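
The IndexBuilder.cpp entry above (line 116) asks for a TryRename that does not throw. A minimal sketch of such a non-throwing variant built on the C standard library; the signature is illustrative and would presumably mirror whatever the existing renaming helper takes:

    #include <cstdio>
    #include <string>

    // Non-throwing rename: reports failure through the return value so that
    // callers can fall back or retry instead of unwinding the stack.
    inline bool TryRename(const std::string& from, const std::string& to)
    {
        return std::rename(from.c_str(), to.c_str()) == 0;
    }

Callers can then branch on the result instead of wrapping the call in try/catch.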
Source/CNTKv2LibraryDll/API/CNTKLibraryExperimental.h (1 line):
- line 155: /// TODO: Currently chunk->SequenceInfos() == deserializer->SequenceInfo(chunk),

Source/Readers/Kaldi2Reader/DataReader.cpp (1 line):
- line 7: // TODO: Rename to Exports.cpp

Source/Readers/Kaldi2Reader/simplethread.h (1 line):
- line 20: class signallingevent // TODO: should this go into basetypes.h?

Source/Readers/HTKMLFReader/chunkevalsource.h (1 line):
- line 12: #include "ssematrix.h" // TODO: why can it not be removed for Windows as well? At least needs a comment here.

Tools/cmake/cpp_common.cmake (1 line):
- line 147: # TODO: -Werror # Treat all warnings as errors.

Source/Readers/ReaderLib/ReaderBase.cpp (1 line):
- line 40: // TODO: In case when the network requires less inputs,

Source/Readers/ReaderLib/ReaderShim.h (1 line):
- line 57: // TODO: if the prefetch is still valid, print a warning here!

Source/Math/GPUMatrix.h (1 line):
- line 670: // TODO: This is now ignored on input, so we should change it to an empty string. This might break parsing, and must be tested first

Source/Readers/LUSequenceReader/Exports.cpp (1 line):
- line 22: // TODO: codecvt should be supported in the latest gcc,

Source/Readers/LUSequenceReader/stdafx.cpp (1 line):
- line 7: // TODO: reference any additional headers you need in STDAFX.H

Source/Common/Config.cpp (1 line):
- line 123: // TODO: This message is written to stderr before stderr gets redirected to the specified file. Fix this.

Source/CNTKv2LibraryDll/stdafx.cpp (1 line):
- line 7: // TODO: reference any additional headers you need in STDAFX.H

Source/CNTKv2LibraryDll/EvaluatorWrapper.cpp (1 line):
- line 77: // TODO: Avoid copying.

bindings/common/CNTKWarnFilters.i (1 line):
- line 27: // TODO: it is not clear how to limit this only to hash, but we do not use partial specialization in other places.

Source/CNTK/ModelEditLanguage.cpp (1 line):
- line 18: // TODO: Allowing partial matches seems misguided. We should discourage that, or just remove it.

Source/Readers/LUSequenceReader/LUSequenceParser.h (1 line):
- line 31: unsigned flags; // flags that apply to this sequence --TODO: We really need to know at least what those flags are, if an enum is asking for too much.

Source/Readers/HTKDeserializers/stdafx.h (1 line):
- line 23: // TODO: reference additional headers your program requires here

Source/ComputationNetworkLib/MatrixPool.h (1 line):
- line 243: // TODO: Make this a runtime option.

Source/Readers/HTKMLFReader/minibatchsourcehelpers.h (1 line):
- line 194: // TODO: This is currently hardcoded to unsigned short to save space, which means untied context-dependent phones

Source/Readers/HTKDeserializers/MLFDeserializer.cpp (1 line):
- line 40: // TODO: Should be removed. Currently a lot of end-to-end tests still use this one.

Source/Readers/LUSequenceReader/LUSequenceParser.cpp (1 line):
- line 67: ch = trim(ch); // TODO: operates in-place, so no need to assign back [sketch below]

Source/Readers/ReaderLib/ChunkRandomizer.cpp (1 line):
- line 51: // TODO: in case of the chunk-based randomization window, we couldn't care less

Source/Readers/UCIFastReader/stdafx.cpp (1 line):
- line 7: // TODO: reference any additional headers you need in STDAFX.H

bindings/python/cntk/tensor.py (1 line):
- line 86: # TODO __xor__, __rxor__, __pow__, __rpow__, __invert__

Source/Readers/ReaderLib/CorpusDescriptor.h (1 line):
- line 17: // TODO: Extract an interface.
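
The LUSequenceParser.cpp entry above (line 67) notes that trim operates in place, making the `ch = trim(ch)` assignment redundant. A sketch of the two shapes such a helper can take; both signatures are hypothetical and match neither CNTK helper exactly:

    #include <string>

    // In-place: mutates the argument; assigning the result back is redundant.
    inline void TrimInPlace(std::string& s)
    {
        const char* ws = " \t\r\n";
        size_t first = s.find_first_not_of(ws);
        if (first == std::string::npos) { s.clear(); return; } // all whitespace
        s.erase(s.find_last_not_of(ws) + 1); // drop trailing whitespace first
        s.erase(0, first);                   // then drop leading whitespace
    }

    // By value: leaves the argument alone and returns a trimmed copy. This is
    // the shape that *does* require `ch = trim(ch);`.
    inline std::string Trimmed(std::string s)
    {
        TrimInPlace(s);
        return s;
    }

Only the by-value form requires assigning the result back; for the in-place form the assignment is harmless but redundant, as the TODO says.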
Source/Readers/CNTKTextFormatReader/stdafx.h (1 line):
- line 21: // TODO: reference additional headers your program requires here

Source/CNTKv2LibraryDll/proto/onnx/Operators.cpp (1 line):
- line 85: // TODO: set key as BatchNormalization instead of BatchNormalizationCaffe