<center>Photo by <a style="background-color:black;color:white;text-decoration:none;padding:4px 6px;font-family:-apple-system, BlinkMacSystemFont, "San Francisco", "Helvetica Neue", Helvetica, Ubuntu, Roboto, Noto, "Segoe UI", Arial, sans-serif;font-size:12px;font-weight:bold;line-height:1.2;display:inline-block;border-radius:3px" href="https://unsplash.com/photos/lWYUA42UmL8" target="_blank" rel="noopener noreferrer" title="Download free do whatever you want high-resolution photos from Joshua Earle"><span style="display:inline-block;padding:2px 3px"><svg xmlns="http://www.w3.org/2000/svg" style="height:12px;width:auto;position:relative;vertical-align:middle;top:-2px;fill:white" viewBox="0 0 32 32"><title>unsplash-logo</title><path d="M10 9V0h12v9H10zm12 5h10v18H0V14h10v9h12v-9z"></path></svg></span><span style="display:inline-block;padding:2px 3px">Joshua Earle</span></a></center>
Origin: YOLOv3: An Incremental Improvement
Improvements
1. New backbone structure (Darknet-53)
<div class="gallery" data-columns="1"> <img src="/images/Paper/YOLOv3/YOLOv3_Arch.jpg"> <img src="/images/Paper/YOLOv3/new_structure.JPG"> </div>
2. Multiscale Structure
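3 scales, with 3 anchors per grid cell at each scale (grid sizes assume a 416 x 416 input):
- small grid (13 x 13, stride 32) -> large anchors, for large objects
- mid grid (26 x 26, stride 16) -> medium anchors
- large grid (52 x 52, stride 8) -> small anchors, for small objects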
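The grid-to-anchor pairing can be sketched in a few lines. This is a minimal illustration, assuming a 416 x 416 input and strides of 32, 16 and 8; the nine anchor sizes are the COCO priors listed in the paper, and the function name `anchors_per_scale` is hypothetical.

```python
# Assumed input resolution; it yields the 13 / 26 / 52 grids used in these notes.
INPUT_SIZE = 416
# Downsampling factor of each detection scale (assumption: 32, 16, 8).
STRIDES = (32, 16, 8)
# The nine COCO anchor priors (width, height) in pixels, as listed in the paper.
ANCHORS = [(10, 13), (16, 30), (33, 23), (30, 61), (62, 45),
           (59, 119), (116, 90), (156, 198), (373, 326)]


def anchors_per_scale(input_size=INPUT_SIZE, strides=STRIDES, anchors=ANCHORS):
    """Map each grid size to its anchors: the coarsest grid gets the largest."""
    ordered = sorted(anchors, key=lambda wh: wh[0] * wh[1])  # sort by area
    per_scale = len(ordered) // len(strides)
    result = {}
    for i, stride in enumerate(sorted(strides, reverse=True)):
        grid = input_size // stride
        start = len(ordered) - (i + 1) * per_scale
        result[grid] = ordered[start:start + per_scale]
    return result


for grid, group in anchors_per_scale().items():
    print(f"{grid} x {grid}: {group}")
```

A coarse grid cell covers a large patch of the image, so it is the natural place to predict large boxes; the fine 52 x 52 grid keeps enough spatial detail for small objects.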
3. Classification Change
- 80 classes (COCO): softmax -> independent logistic classifiers, one per class
Using a softmax imposes the assumption that each box has
exactly one class which is often not the case. A multilabel
approach better models the data.
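A small comparison shows why this matters for overlapping labels. This is only a sketch: the three logits and the class names in the comment are made up for illustration, and the helper names are hypothetical.

```python
import math


def softmax(logits):
    """Scores that compete with each other and sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def multilabel_scores(logits):
    """Independent logistic score per class; several can be high at once."""
    return [sigmoid(x) for x in logits]


# Hypothetical logits for classes such as "person", "woman", "car".
logits = [4.0, 3.5, -2.0]
print("softmax :", [round(p, 3) for p in softmax(logits)])
print("logistic:", [round(p, 3) for p in multilabel_scores(logits)])
```

With softmax the two strong classes split the probability mass, so at most one can exceed 0.5; with independent logistic outputs both can score close to 1.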
4. Use an FPN-style structure: upsample coarse features and concatenate them with finer ones
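The fusion step behind this idea, upsampling a coarse feature map by 2x and concatenating it channel-wise with a finer one, can be sketched with plain nested lists. The channel counts (2 and 3) are deliberately tiny and hypothetical; a real network would use a tensor library and many more channels.

```python
def make_map(size, channels, value):
    """Build a size x size x channels feature map filled with one value."""
    return [[[value] * channels for _ in range(size)] for _ in range(size)]


def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of an H x W x C nested list."""
    out = []
    for row in fmap:
        wide = [list(cell) for cell in row for _ in range(2)]  # repeat columns
        out.append(wide)
        out.append([list(cell) for cell in wide])              # repeat the row
    return out


def concat_channels(a, b):
    """Concatenate two maps of equal height and width along the channel axis."""
    return [[ca + cb for ca, cb in zip(ra, rb)] for ra, rb in zip(a, b)]


coarse = make_map(13, 2, 1.0)   # deep, low-resolution features
fine = make_map(26, 3, 0.5)     # earlier, higher-resolution features
fused = concat_channels(upsample2x(coarse), fine)
print(len(fused), len(fused[0]), len(fused[0][0]))
```

The fused map has the resolution of the finer grid and carries both the semantic information of the deep layer and the spatial detail of the earlier one.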
Summary
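Output (per grid cell: 3 anchors x (4 box coordinates + 1 objectness + 80 class scores) = 255 channels)
- 13 x 13 x (3 * (4 + 1 + 80)) = 13 x 13 x 255
- 26 x 26 x (3 * (4 + 1 + 80)) = 26 x 26 x 255
- 52 x 52 x (3 * (4 + 1 + 80)) = 52 x 52 x 255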
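The arithmetic behind these output shapes is easy to check. This is a minimal sketch; `head_shape` and `total_boxes` are hypothetical helper names, and the defaults assume 3 anchors per cell and 80 classes as in the notes.

```python
def head_shape(grid, anchors_per_cell=3, num_classes=80):
    """Shape of one detection output: grid x grid x (anchors * (4 + 1 + classes))."""
    channels = anchors_per_cell * (4 + 1 + num_classes)
    return (grid, grid, channels)


def total_boxes(grids=(13, 26, 52), anchors_per_cell=3):
    """Number of candidate boxes predicted across all scales."""
    return sum(g * g * anchors_per_cell for g in grids)


for g in (13, 26, 52):
    print(head_shape(g))
print("candidate boxes:", total_boxes())
```

For the three grids above this gives 255 channels per scale and 10647 candidate boxes in total, before any confidence thresholding or non-maximum suppression.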