deeplearning / IllustrationTagger / 0.4.0

1. Introduction

Illustration Tagger is a classification algorithm that tries to find the best-matching tags for an image. It extracts features from the image and measures their similarity to categories of images.

Note: As of version 0.3.1, the output JSON can be collapsed by setting the 'collapse_output' key to true.

Input:

  • (Required) Image: a Data API URL, a web (http/https) URL, a binary image, or a base64-encoded JPEG/GIF/PNG string**
  • (Optional) Confidence threshold* (default threshold=0.2)
  • (Optional) Desired tags*
  • (Optional) collapse_output

* You cannot provide both the threshold and desired tags at the same time.

** As of version 0.2.0, base64-encoded JPEG strings are a valid image input format; version 0.2.5 adds GIF and PNG.
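The input rules above can be checked on the client before calling the algorithm. The following is a minimal sketch: `build_input` is a hypothetical helper name, not part of the algorithm, and it only assembles the JSON keys documented here (`image`, `threshold`, `tags`, `collapse_output`).

```python
from typing import Any, Dict, List, Optional


def build_input(image: str,
                threshold: Optional[float] = None,
                tags: Optional[List[str]] = None,
                collapse_output: bool = False) -> Dict[str, Any]:
    """Assemble the JSON input, rejecting threshold and tags given together."""
    # The README states that threshold and desired tags are mutually exclusive.
    if threshold is not None and tags is not None:
        raise ValueError("provide either 'threshold' or 'tags', not both")
    payload: Dict[str, Any] = {"image": image}
    if threshold is not None:
        payload["threshold"] = threshold
    if tags is not None:
        payload["tags"] = list(tags)
    if collapse_output:
        payload["collapse_output"] = True
    return payload


payload = build_input("data://deeplearning/example_data/trudeau.jpg",
                      threshold=0.1)
print(sorted(payload))  # ['image', 'threshold']
```

Omitting `threshold` leaves the default (0.2) to the algorithm rather than hard-coding it on the client.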

Output:

  • Tagging categories (when desired tags are not given):
      • rating
      • character
      • copyright
      • general
  • Confidence for desired tags.

Note: The first call to this algorithm takes a bit longer than subsequent calls because of algorithm initialization. All following calls will be significantly faster.

2. How Tagging Works

The algorithm looks at the given image and calculates the similarity between the image and available tags. It returns 4 categories of predicted tags when available.

  1. The rating category tells you how NSFW the image is. It gives confidence values for how safe, questionable or explicit the image is.
  2. The character category tries to guess which character appears in the image.
  3. The copyright category tries to predict who owns the copyright to the corresponding image. Because the model was trained on a Japanese illustration dataset, it may not return results for other copyright owners.
  4. The general category has several hundred tags. This category gives you a good understanding of which tags describe your image.
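As an illustration of how the rating category might be consumed, the sketch below picks the most confident rating from a result. It assumes the uncollapsed output shape shown in the examples in this README (a list of single-key objects); `top_rating` is a hypothetical helper name.

```python
from typing import Dict, List, Tuple


def top_rating(result: Dict[str, List[Dict[str, float]]]) -> Tuple[str, float]:
    """Return the (label, confidence) pair with the highest rating confidence."""
    pairs = [(label, conf)
             for entry in result["rating"]
             for label, conf in entry.items()]
    return max(pairs, key=lambda pair: pair[1])


example = {
    "rating": [
        {"safe": 0.9924461245536804},
        {"questionable": 0.006749290972948074},
        {"explicit": 0.0001874923618743196},
    ]
}
label, confidence = top_rating(example)
print(label)  # safe
```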

Alternatively, if you only want the confidence values for specific tags, you can pass the tags parameter to the algorithm. The algorithm will then return confidence values for the specified tags only.

A list of all available tags can be viewed here.

3. Examples

Example 1.

  • Parameter 1: Data API Url
{
    "image": "data://deeplearning/example_data/trudeau.jpg"
}

Output:

{
  "rating": [
    {"safe": 0.9924461245536804},
    {"questionable": 0.006749290972948074},
    {"explicit": 0.0001874923618743196}
  ],
  "character": [],
  "copyright": [{"real life": 0.38196513056755066}],
  "general": [
    {"1boy": 0.93906170129776},
    {"solo": 0.9158311486244202},
    {"male": 0.6592674255371094},
    {"black hair": 0.42818406224250793},
    {"necktie": 0.23634621500968933},
    {"formal": 0.23075371980667114}
  ]
}
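A call like Example 1 could be made from Python roughly as follows. This is a sketch that assumes the Algorithmia Python client is installed (`pip install algorithmia`) and that you have your own API key; `tag_image` is a hypothetical wrapper name, and the algorithm path is taken from the header of this page.

```python
from typing import Any, Dict

# Algorithm path and version as shown at the top of this page.
ALGORITHM = "deeplearning/IllustrationTagger/0.4.0"


def tag_image(api_key: str, image: str) -> Dict[str, Any]:
    """Send one image reference to the algorithm and return its result."""
    # Imported lazily so this module loads even if the client is not installed.
    import Algorithmia

    client = Algorithmia.client(api_key)
    algo = client.algo(ALGORITHM)
    return algo.pipe({"image": image}).result


# Usage (requires a valid key and network access):
# result = tag_image("YOUR_API_KEY", "data://deeplearning/example_data/trudeau.jpg")
```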

Example 2.

  • Parameter 1: HTTP Url
{
    "image": "https://upload.wikimedia.org/wikipedia/commons/thumb/9/9a/Trudeaujpg.jpg/348px-Trudeaujpg.jpg"
}

Output:

{
  "rating": [
    {"safe": 0.9924461245536804},
    {"questionable": 0.006749290972948074},
    {"explicit": 0.0001874923618743196}
  ],
  "character": [],
  "copyright": [{"real life": 0.38196513056755066}],
  "general": [
    {"1boy": 0.93906170129776},
    {"solo": 0.9158311486244202},
    {"male": 0.6592674255371094},
    {"black hair": 0.42818406224250793},
    {"necktie": 0.23634621500968933},
    {"formal": 0.23075371980667114}
  ]
}

Example 3.

  • Parameter 1: Data API Url
  • Parameter 2: Threshold (Value=0.1)
{
    "image": "data://deeplearning/example_data/trudeau.jpg",
    "threshold": 0.1
}

Output:

{
  "rating": [
    {"safe": 0.9924461245536804},
    {"questionable": 0.006749290972948074},
    {"explicit": 0.0001874923618743196}
  ],
  "character": [],
  "copyright": [{"real life": 0.38196513056755066}],
  "general": [
    {"1boy": 0.93906170129776},
    {"solo": 0.9158311486244202},
    {"male": 0.6592674255371094},
    {"black hair": 0.42818406224250793},
    {"necktie": 0.23634621500968933},
    {"formal": 0.23075371980667114},
    {"suit": 0.1762586236000061},
    {"black eyes": 0.1671484112739563},
    {"smile": 0.15806205570697784},
    {"facial hair": 0.15715354681015015},
    {"brown hair": 0.15507465600967407},
    {"short hair": 0.1326742023229599},
    {"cosplay": 0.12354501336812973},
    {"bust": 0.11704712361097336},
    {"photo": 0.10584855079650879}
  ]
}
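To see how the threshold relates Example 3 to Example 1, the sketch below filters a tag list on the client side. It is not the algorithm's own implementation: whether the algorithm compares with `>=` or `>` is an assumption, and in the examples above the rating entries appear to be returned regardless of the threshold. `filter_tags` is a hypothetical helper name.

```python
from typing import Dict, List

Tags = List[Dict[str, float]]


def filter_tags(tags: Tags, threshold: float) -> Tags:
    """Keep only the entries whose confidence is at least `threshold`."""
    return [entry for entry in tags
            if all(conf >= threshold for conf in entry.values())]


general = [
    {"formal": 0.23075371980667114},
    {"suit": 0.1762586236000061},
    {"photo": 0.10584855079650879},
]
kept = [label for entry in filter_tags(general, 0.2) for label in entry]
print(kept)  # ['formal']
```

With a threshold of 0.1 all three entries would be kept, which matches the longer "general" list in Example 3.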

Example 4.

  • Parameter 1: HTTP Url
  • Parameter 2: Desired tags ("1boy", "male", "sky", "water", "safe")
{
    "image": "https://upload.wikimedia.org/wikipedia/commons/thumb/9/9a/Trudeaujpg.jpg/348px-Trudeaujpg.jpg",
    "tags": ["1boy", "male", "sky", "water", "safe"]
}

Output:

{
  "all_tags": [
    {"water": 0.0006849411875009537},
    {"1boy": 0.93906170129776},
    {"safe": 0.9924461245536804},
    {"male": 0.6592674255371094},
    {"sky": 0.0015039071440696716}
  ]
}
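The "all_tags" list is returned as single-key objects in no particular order, so a lookup table can be more convenient. A minimal sketch, assuming the output shape shown in Example 4 (`tags_to_dict` is a hypothetical helper name):

```python
from typing import Dict, List


def tags_to_dict(result: Dict[str, List[Dict[str, float]]]) -> Dict[str, float]:
    """Flatten the "all_tags" list into a {label: confidence} mapping."""
    return {label: conf
            for entry in result["all_tags"]
            for label, conf in entry.items()}


response = {
    "all_tags": [
        {"water": 0.0006849411875009537},
        {"1boy": 0.93906170129776},
        {"sky": 0.0015039071440696716},
    ]
}
confidences = tags_to_dict(response)
print(confidences["1boy"] > confidences["sky"])  # True
```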

Example 5.

  • Parameter 1: Data API Url
  • Parameter 2: Threshold (Value=0.1)
  • Parameter 3: Collapse output provided
{
    "image": "data://deeplearning/example_data/trudeau.jpg",
    "threshold": 0.1,
    "collapse_output": true
}

Output:

[  
  {  
    "confidence":0.9924461245536804,
    "label":"safe"
  },
  {  
    "confidence":0.9390615820884703,
    "label":"1boy"
  },
  {  
    "confidence":0.9158311486244202,
    "label":"solo"
  },
  {  
    "confidence":0.6592671275138855,
    "label":"male"
  },
  {  
    "confidence":0.4281844198703766,
    "label":"black hair"
  },
  {  
    "confidence":0.3819650411605835,
    "label":"real life"
  },
  {  
    "confidence":0.2363460510969162,
    "label":"necktie"
  },
  {  
    "confidence":0.2307535260915756,
    "label":"formal"
  },
  {  
    "confidence":0.17625851929187775,
    "label":"suit"
  },
  {  
    "confidence":0.16714857518672943,
    "label":"black eyes"
  },
  {  
    "confidence":0.15806202590465546,
    "label":"smile"
  },
  {  
    "confidence":0.15715324878692627,
    "label":"facial hair"
  },
  {  
    "confidence":0.15507452189922333,
    "label":"brown hair"
  },
  {  
    "confidence":0.13267424702644348,
    "label":"short hair"
  },
  {  
    "confidence":0.12354494631290436,
    "label":"cosplay"
  },
  {  
    "confidence":0.11704707890748978,
    "label":"bust"
  },
  {  
    "confidence":0.10584864020347597,
    "label":"photo"
  },
  {  
    "confidence":0.006749283988028765,
    "label":"questionable"
  },
  {  
    "confidence":0.0001874927111202851,
    "label":"explicit"
  }
]
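For callers that receive the uncollapsed output, a similar flat list can be produced on the client. The sketch below is an approximation under stated assumptions, not the algorithm's own code: it merges the four categories and sorts by descending confidence, which is the ordering seen in Example 5. `collapse` is a hypothetical helper name.

```python
from typing import Any, Dict, List

CATEGORIES = ("rating", "character", "copyright", "general")


def collapse(result: Dict[str, List[Dict[str, float]]]) -> List[Dict[str, Any]]:
    """Merge all categories into one list sorted by descending confidence."""
    flat = [{"confidence": conf, "label": label}
            for category in CATEGORIES
            for entry in result.get(category, [])
            for label, conf in entry.items()]
    return sorted(flat, key=lambda item: item["confidence"], reverse=True)


structured = {
    "rating": [{"safe": 0.9924461245536804},
               {"questionable": 0.006749290972948074}],
    "character": [],
    "copyright": [{"real life": 0.38196513056755066}],
    "general": [{"1boy": 0.93906170129776}],
}
labels = [item["label"] for item in collapse(structured)]
print(labels)  # ['safe', '1boy', 'real life', 'questionable']
```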

4. Credits

For more information, please refer to http://illustration2vec.net/ or Saito, Masaki and Matsui, Yusuke (2015). Illustration2Vec: A Semantic Vector Representation of Illustrations. SIGGRAPH Asia Technical Briefs.

Demo image was retrieved from http://flickr.com/photos/25480181@N06/4929681007 and is used under the CC BY license.