HOME/Articles/

YOLOv3

Article Outline

image

<center>Photo by <a style="background-color:black;color:white;text-decoration:none;padding:4px 6px;font-family:-apple-system, BlinkMacSystemFont, &quot;San Francisco&quot;, &quot;Helvetica Neue&quot;, Helvetica, Ubuntu, Roboto, Noto, &quot;Segoe UI&quot;, Arial, sans-serif;font-size:12px;font-weight:bold;line-height:1.2;display:inline-block;border-radius:3px" href="https://unsplash.com/photos/lWYUA42UmL8" target="_blank" rel="noopener noreferrer" title="Download free do whatever you want high-resolution photos from Joshua Earle"><span style="display:inline-block;padding:2px 3px"><svg xmlns="http://www.w3.org/2000/svg" style="height:12px;width:auto;position:relative;vertical-align:middle;top:-2px;fill:white" viewBox="0 0 32 32"><title>unsplash-logo</title><path d="M10 9V0h12v9H10zm12 5h10v18H0V14h10v9h12v-9z"></path></svg></span><span style="display:inline-block;padding:2px 3px">Huper Earle</span></a></center>

Origin: YOLOv3: An Incremental Improvement

Improvement

1. New structure

<div class="gallery" data-columns="1"> <img src="/images/Paper/YOLOv3/YOLOv3_Arch.jpg"> <img src="/images/Paper/YOLOv3/new_structure.JPG"> </div>

2. Mutiscale Structure

3 scales and 3 anchors per scale per grid:

  • small scale (13 x 13) ——> large anchor
  • mid scale (26 x 26) ——> medium anchor
  • large scale (52 x 52) ——> small anchor image

3. Change Classfication

  • 80 classes, from softmax ——> logistic

Using a softmax imposes the assumption that each box has exactly one class which is often not the case. A multilabel approach better models the data.

4. Use FPN

image

Summary

Output

  • 13 x 13 x 3 * (4+1 + 80)
  • 26 x 26 x 3 * (4+1 + 80)
  • 52 x 52 x 3 * (4+1 + 80)