Data browser extension

Version 1.6.0 of refinery introduces changes to the data browser to provide enhanced filtering capabilities for attributes. It is now also possible to rename labels in the whole project and change the source code in which labels are used with one click. The refinery demo application was also overhauled in order to achieve higher accessibility and cover multiple different roles. Various bugs have been addressed in this version of refinery as well.

Data browser enhancements

The data browser is one of the main features of refinery, allowing you to slice and filter data in many ways. In previous versions, the attribute filters were limited to a few operators, such as CONTAINS or EQUAL. These operators already covered a lot, but there were still many cases in which they were not sufficient. The new version offers much more in-depth ways to filter attributes.

For example, there is now the option to use wildcards for filtering. A wildcard is a placeholder for unknown characters, which lets you cover a wider range of values or filter when the exact target value is not known. In the image below you can see wildcards used to get all the two-digit IDs starting with 1 and all IDs starting with 4. Use _ (or ?) as a wildcard for exactly one character and % (or *) for anything that follows. The filter then returns all records with the running IDs 10 to 19 or with any running ID starting with 4 (so 4, 40, 41, ... 400, 401, ...).
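The wildcard semantics described above can be sketched in a few lines of Python. This is only an illustration of the matching rules, not refinery's actual implementation; the helper names wildcard_to_regex and matches_any and the sample IDs are made up for this example.

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a wildcard pattern into a regular expression (illustrative helper)."""
    parts = []
    for char in pattern:
        if char in "_?":
            parts.append(".")    # exactly one character
        elif char in "%*":
            parts.append(".*")   # any run of characters, including none
        else:
            parts.append(re.escape(char))  # everything else is matched literally
    return re.compile("".join(parts), re.DOTALL)


def matches_any(value: str, patterns: list) -> bool:
    """True if the whole value matches at least one wildcard pattern."""
    return any(wildcard_to_regex(p).fullmatch(value) for p in patterns)


# Hypothetical running IDs, stored as strings for matching.
running_ids = ["4", "9", "10", "15", "19", "21", "40", "100", "401"]
selected = [rid for rid in running_ids if matches_any(rid, ["1_", "4%"])]
print(selected)  # ['4', '10', '15', '19', '40', '401']
```

Note that "100" is not selected: the pattern 1_ only allows a single character after the 1, whereas 4% accepts any number of characters after the 4.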

The following operators have been added: IN, IN WC (wildcard), BETWEEN, GREATER, GREATER EQUAL, LESSER and LESSER EQUAL.
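To make the intent of the comparison operators concrete, here is a minimal Python sketch of how such filters could be evaluated against a single attribute value. The function attribute_matches and the sample records are hypothetical, and the sketch assumes that BETWEEN includes both bounds, which these notes do not specify. IN WC is left out here because it follows the wildcard rules described above.

```python
import operator as op

# Operator names as listed in the release notes, mapped to Python comparisons.
COMPARISONS = {
    "GREATER": op.gt,
    "GREATER EQUAL": op.ge,
    "LESSER": op.lt,
    "LESSER EQUAL": op.le,
}


def attribute_matches(value, operator_name, operand):
    """Evaluate one filter against one attribute value (illustrative only)."""
    if operator_name == "IN":
        return value in operand            # operand is a collection of allowed values
    if operator_name == "BETWEEN":
        low, high = operand
        return low <= value <= high        # assumption: both bounds are inclusive
    if operator_name in COMPARISONS:
        return COMPARISONS[operator_name](value, operand)
    raise ValueError(f"Unsupported operator: {operator_name}")


# Hypothetical records with a numeric attribute.
records = [{"running_id": i} for i in (3, 10, 15, 42)]
between = [r["running_id"] for r in records
           if attribute_matches(r["running_id"], "BETWEEN", (10, 20))]
in_list = [r["running_id"] for r in records
           if attribute_matches(r["running_id"], "IN", [3, 42])]
print(between)  # [10, 15]
print(in_list)  # [3, 42]
```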

It is now also possible to make a filter case-sensitive. Searching for "EU" previously returned all words containing "eu" (such as the word "lineup"). With case sensitivity activated, the filter only returns results that match the provided casing.
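The difference between the two modes can be shown with a small Python sketch. The contains helper and the word list are invented for illustration and are not taken from refinery.

```python
def contains(value: str, term: str, case_sensitive: bool = False) -> bool:
    """Substring check that optionally respects casing (illustrative helper)."""
    if case_sensitive:
        return term in value
    return term.lower() in value.lower()


words = ["EU", "lineup", "Europe", "EUR"]

# Case-insensitive: every word containing "eu" in any casing matches.
insensitive = [w for w in words if contains(w, "EU")]
# Case-sensitive: only words containing exactly "EU" match.
sensitive = [w for w in words if contains(w, "EU", case_sensitive=True)]

print(insensitive)  # ['EU', 'lineup', 'Europe', 'EUR']
print(sensitive)    # ['EU', 'EUR']
```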

Label renaming

The names of labels can now be changed on the settings page. In previous versions, the names of labels were immutable. This was partly because other parts of the project, such as label functions or lookup lists, were tied to the labels by name. When changing the name of a label, a prompt now opens, allowing you to change all the label names in the source code of label functions, active learners or lookup lists with just one click.

Please keep in mind that we provide a "best guess" for the changes. Since code, e.g. Python, is very versatile, some of the suggested changes might not be what you intended.

In this example, the code checks for a word that matches the label name "by accident". Renaming that occurrence as well would result in code that no longer provides the previous functionality. So before applying the changes, double-check what the helper suggests.
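The following Python sketch shows why a purely name-based replacement can go wrong. It is a deliberately naive stand-in, not the logic refinery uses: the function rename_label_in_source, the label names and the labeling function source are all hypothetical.

```python
import re


def rename_label_in_source(source: str, old_name: str, new_name: str) -> str:
    """Naively replace every whole-word occurrence of a label name (illustrative only)."""
    pattern = r"\b" + re.escape(old_name) + r"\b"
    return re.sub(pattern, new_name, source)


# Hypothetical labeling function: the first "positive" is a keyword searched
# for in the text, the second one is the label that gets returned.
labeling_function = '''
def lf(record):
    if "positive" in record["text"].lower():
        return "positive"
'''

renamed = rename_label_in_source(labeling_function, "positive", "good")
print(renamed)
```

After the replacement, the function looks for the word "good" in the text instead of "positive", so its behavior has changed even though only the returned label was supposed to be renamed. This is the kind of case where a manual review of the suggested changes is needed.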

Demo application overhaul

The demo application, which can be found at demo.kern.ai, was also given multiple updates. The login screen is now more accessible and allows you to select from different roles before entering the application. You may now choose between the roles:

  • Engineer: Access to all features in refinery.
  • Expert: Access to the labeling view only.
  • Annotator: Access to a task-minimized labeling view only.

To further highlight the different abilities of the roles, the demo application now also features multiuser example projects. They are similar to the example projects which could previously be found in refinery, but are enriched with multiuser functionalities such as crowd heuristics or specific data slices for the labeling view.

The demo application hosting provider was also moved from AWS to DigitalOcean. However, this should have no impact on the actual experience of the user.

Beta: Lite import assistant for Label Studio exports

In the previous version, the export into Label Studio was introduced, providing the fitting format to export data from refinery into Label Studio. Version 1.6.0 of refinery now offers a beta functionality to import data from Label Studio itself into our application. Being a beta feature, this has many restrictions and is only usable for binary classification projects.

Please let us know if you want this feature to be expanded to more project types.

Minor changes

  • A redundant second cancel button was removed from the project creation page.
  • After a project is deleted, the last 5 notifications are now kept (instead of only 1).
  • The overview of all the downloaded models has no project connection anymore. Issue #45
  • Hovering over filter slices with long names now reveals their full name. Issue #178
  • The scaling on the confidence charts now shows a fixed max value. Issue #139
  • Fixed a bug in which blacklisted terms still showed up in lookup lists. Issue #170
  • Fixed a bug where the label distribution chart hover values were multiplied by 100. Issue #167
  • A lot of micro changes (e.g. icons to differentiate heuristics, deletion icon color, ...)