The input convex neural network is a scalar-valued neural network. This output is a scalar value, whereas the inputs are a feature vector and a label vector. The output value is a convex function of the label vector. The amazing part of this work is that it changes a classification problem to an optimization problem. To do a classification, one should optimize the input label to get the greatest output scalar. Since it is a convex function, the optimization is efficient and accurate. The advantage of doing so is that it can help the model discover latent relations between features and labels.
When applying this model to reinforcement learning, we use input convex neural network to model the Q function. We consider the environment state as the feature input and the current action as the label input. Taking advantage of the nice convex character, we can easily find the optimal action in a continuous space.