To illustrate the usability of CREX, we describe here a real-world use case scenario.
Suppose that we have a huge dataset of unsorted tweets belonging to various knowledge and
interest domains, say, sports and politics. Suppose now that we have a limited budget and that
we want to run two different tweet labeling tasks, with different instructions and output types
depending on the type of the tweet. For instance:
For sports-related tweets - task 1:
For politics-related tweets - task 2:
One way of achieving this is to manually label the tweets to split them into three categories
(sports, politics, and others), then sample the tweets down to a number that respects the
available budget, and then build the two tasks and publish them. CREX allows us to achieve this
automatically: it groups the tweets through clustering, then applies constrained sampling,
and finally uses the requester's input to generate the tasks from the sampled tweets.
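
The last two steps of this scenario, constrained sampling and task generation, can be sketched
in Python as follows. This is only an illustrative sketch under stated assumptions, not CREX's
actual implementation: it assumes that cluster labels have already been assigned to the tweets,
that every tweet has the same fixed labeling cost, and that the budget is spread evenly over the
clusters of interest. The names constrained_sample, build_tasks, cost_per_tweet and the
placeholder instructions are hypothetical.

```python
import random
from collections import defaultdict


def constrained_sample(tweets, labels, target_labels, budget, cost_per_tweet, seed=0):
    """Draw tweets round-robin from the target clusters without exceeding the budget.

    Assumption: a fixed, positive cost per tweet (not taken from the paper).
    """
    if cost_per_tweet <= 0:
        raise ValueError("cost_per_tweet must be positive")
    rng = random.Random(seed)

    # Group the tweets by their (already computed) cluster label.
    clusters = defaultdict(list)
    for tweet, label in zip(tweets, labels):
        clusters[label].append(tweet)

    # Shuffle each target cluster once so that draws are random but reproducible.
    pools = {}
    for label in target_labels:
        items = clusters.get(label, [])
        pools[label] = rng.sample(items, len(items))

    remaining = int(budget // cost_per_tweet)  # maximum number of tweets we can afford
    sample = {label: [] for label in target_labels}
    while remaining > 0 and any(pools.values()):
        for label in sorted(pools):
            if remaining == 0:
                break
            if pools[label]:
                sample[label].append(pools[label].pop())
                remaining -= 1
    return sample


def build_tasks(sample, instructions):
    """Attach the requester's instruction for each cluster to its sampled tweets."""
    return [
        {"cluster": label, "instruction": instructions[label], "tweet": tweet}
        for label, cluster_tweets in sample.items()
        if label in instructions
        for tweet in cluster_tweets
    ]


# Toy data: labels stand in for the output of the clustering step.
tweets = ["goal in the final", "match tonight", "election results",
          "new tax bill", "senate vote", "nice weather"]
labels = ["sports", "sports", "politics", "politics", "politics", "others"]

# Budget and cost are expressed in the same (hypothetical) unit, e.g. cents.
sample = constrained_sample(tweets, labels, ["sports", "politics"],
                            budget=100, cost_per_tweet=25)
tasks = build_tasks(sample, {"sports": "<task 1 instruction>",
                             "politics": "<task 2 instruction>"})
```

Tweets in the "others" cluster are never drawn, so the whole budget goes to the two tasks of
interest. A different allocation policy (for example, proportional to cluster size) would only
change the round-robin loop.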