【Approximate / spine / filament series】 2020/12/20
★ssim-20221020 … Added support for SORT-MERGE (new MEX + conv-c2c)
Added support for SHA256 (OP_MAJ, OP_CH, OP_ROTS) and FFT (conv-c2c)
filter/filter-zynq.emax6+dma OK
mm_cnn_lf/mm-zynq.emax6+dma OK
mm_cnn_lf/cnn-zynq.emax6+dma OK
mm_cnn_lf/gdepth-zynq.emax6+dma OK
mm_cnn_lf/inv-zynq.emax6+dma OK
mm_cnn_lf/gather-zynq.emax6+dma OK
stencil/stencil-zynq.emax6+dma OK
stringsearch/search-zynq.emax6+dma OK
test/test021-zynq.emax6+dma OK
test/test022-zynq.emax6+dma OK
test/test024-zynq.emax6+dma OK
crypto/sha256-zynq.emax6+dma OK
fft/fft-zynq.emax6+dma OK
ssim/ssim-zynq.emax6+dma OK
sort/sort-merge-zynq.emax6+dma OK
★ssim-20220720 … Removed the FT patch (added dmrp_stat to LMRING_BUSY); emax6lib.c changed
★ssim-20220620 … Changed the CNN layer parameters
./ssim-cent -r -i -I0 -C1 -F1 err=0.0130○
./ssim-cent -r -i -I0 -C1 -F2 err=0.0362
./ssim-cent -r -i -I0 -C1 -F3 err=0.0394
./ssim-cent -r -i -I0 -C2 -F1 err=0.0099○
./ssim-cent -r -i -I0 -C3 -F1 err=0.0078○
./ssim-cent -r -i -I0 -V3 -C1 -F1 err=0.0365, -S err=0.0629
./ssim-cent -r -i -I0 -V3 -C1 -F2 err=0.0616, -S err=0.2218
./ssim-cent -r -i -I0 -V3 -C1 -F3 err=0.0651, -S err=0.8592
./ssim-cent -r -i -I0 -V3 -C2 -F1 err=0.0147○
./ssim-cent -r -i -I0 -V3 -C3 -F1 err=0.0104○
./ssim-cent -r -i -I1 -C1 -F2 err=0.2989
./ssim-cent -r -i -I1 -C1 -F3 err=0.2628
./ssim-cent -r -i -I1 -C4 -F1 err=0.2160◎
./ssim-cent -r -i -I1 -C6 -F2 err=0.2023◎
./ssim-cent -r -i -I1 -V2 -C1 -F1 err=0.4543, -S err=0.7337
./ssim-cent -r -i -I1 -V2 -C1 -F2 err=0.4296, -S err=0.8323
./ssim-cent -r -i -I1 -V2 -C1 -F3 err=0.3967, -S err=0.8591
★ssim-20220610 … Weights limited to -1.0 to 1.0 by WEIGHTLIMIT
./ssim-cent -r -i -I0 -C1 -F1 err=0.0130○
./ssim-cent -r -i -I0 -C1 -F2 err=0.0362
./ssim-cent -r -i -I0 -C1 -F3 err=0.0759
./ssim-cent -r -i -I0 -C2 -F1 err=0.0091○
./ssim-cent -r -i -I0 -C3 -F1 err=0.0083○
./ssim-cent -r -i -I0 -V3 -C1 -F1 err=0.0365, -S err=0.0629
./ssim-cent -r -i -I0 -V3 -C1 -F2 err=0.0616, -S err=0.2218 ★improved
./ssim-cent -r -i -I0 -V3 -C1 -F3 err=0.0653, -S err=0.8972
./ssim-cent -r -i -I0 -V3 -C2 -F1 err=0.0136○
./ssim-cent -r -i -I0 -V3 -C3 -F1 err=0.0103○
./ssim-cent -r -i -I1 -C1 -F2 err=0.2989
./ssim-cent -r -i -I1 -C1 -F3 err=0.2628
./ssim-cent -r -i -I1 -C4 -F1 err=0.2378
./ssim-cent -r -i -I1 -C6 -F2 err=0.2063◎
./ssim-cent -r -i -I1 -V2 -C1 -F1 err=0.4543, -S err=0.7337
./ssim-cent -r -i -I1 -V2 -C1 -F2 err=0.4296, -S err=0.8323
./ssim-cent -r -i -I1 -V2 -C1 -F3 err=0.3967, -S err=0.8990
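The WEIGHTLIMIT entries bound every weight to a symmetric range (-1.0 to 1.0 here, -2.0 to 2.0 in ssim-20220601). A minimal sketch of such a clamp, assuming the weights sit in a flat float array. The function name clamp_weights and its parameters are hypothetical and are not taken from the ssim sources.

```c
#include <stddef.h>

/* Hypothetical helper: clamp every weight into [-limit, +limit].
 * `limit` plays the role of WEIGHTLIMIT (1.0f or 2.0f in the notes). */
void clamp_weights(float *w, size_t n, float limit)
{
    for (size_t i = 0; i < n; i++) {
        if (w[i] > limit)
            w[i] = limit;
        else if (w[i] < -limit)
            w[i] = -limit;
    }
}
```

Calling clamp_weights(w, n, 1.0f) after each weight update would correspond to the ssim-20220610 setting under these assumptions.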
★ssim-20220601 … Weights limited to -2.0 to 2.0 by WEIGHTLIMIT
./ssim-cent -r -i -I0 -C1 -F1 err=0.0130○
./ssim-cent -r -i -I0 -C1 -F2 err=0.0216
./ssim-cent -r -i -I0 -C1 -F3 err=0.0405
./ssim-cent -r -i -I0 -C2 -F1 err=0.0091○
./ssim-cent -r -i -I0 -C3 -F1 err=0.0083○
./ssim-cent -r -i -I0 -V3 -C1 -F1 err=0.0365, -S err=0.0629
./ssim-cent -r -i -I0 -V3 -C1 -F2 err=0.0424, -S err=0.3475 ★improved
./ssim-cent -r -i -I0 -V3 -C1 -F3 err=0.0432, -S err=0.8865
./ssim-cent -r -i -I0 -V3 -C2 -F1 err=0.0136○
./ssim-cent -r -i -I0 -V3 -C3 -F1 err=0.0103○
./ssim-cent -r -i -I1 -C1 -F2 err=0.2946
./ssim-cent -r -i -I1 -C1 -F3 err=0.2624
./ssim-cent -r -i -I1 -C4 -F1 err=0.2395
./ssim-cent -r -i -I1 -C6 -F2 err=0.2056◎
./ssim-cent -r -i -I1 -V2 -C1 -F1 err=0.4543, -S err=0.7337
./ssim-cent -r -i -I1 -V2 -C1 -F2 err=0.4272, -S err=0.8584
./ssim-cent -r -i -I1 -V2 -C1 -F3 err=0.3948, -S err=0.8626
★ssim-20220501 … Added DROPOUT (drop rate 20%)
./ssim-cent -r -i -I0 -C1 -F1 err=0.0130○
./ssim-cent -r -i -I0 -C1 -F2 err=0.0205
./ssim-cent -r -i -I0 -C1 -F3 err=0.0333
./ssim-cent -r -i -I0 -C2 -F1 err=0.0091○
./ssim-cent -r -i -I0 -C3 -F1 err=0.0083○
./ssim-cent -r -i -I0 -V3 -C1 -F1 err=0.0365, -S err=0.0629
./ssim-cent -r -i -I0 -V3 -C1 -F2 err=0.0410, -S err=0.8854
./ssim-cent -r -i -I0 -V3 -C1 -F3 err=0.0384, -S err=0.8865
./ssim-cent -r -i -I0 -V3 -C2 -F1 err=0.0136○
./ssim-cent -r -i -I0 -V3 -C3 -F1 err=0.0103○
./ssim-cent -r -i -I1 -C1 -F2 err=0.2960
./ssim-cent -r -i -I1 -C1 -F3 err=0.2645
./ssim-cent -r -i -I1 -C4 -F1 err=0.2395
./ssim-cent -r -i -I1 -C6 -F2 err=0.2056◎
./ssim-cent -r -i -I1 -V2 -C1 -F1 err=0.4543, -S err=0.7337
./ssim-cent -r -i -I1 -V2 -C1 -F2 err=0.4220, -S err=0.9006
./ssim-cent -r -i -I1 -V2 -C1 -F3 err=0.3906, -S err=0.9000
※ Rather than ensemble learning inside one large model, the 80% model is trained multiple times
ssim-cent -t -I0 -C1 -F1 drp20% err=0.0130 25% 0.0131 30% 0.0127
ssim-cent -t -I0 -C3 -F1 drp20% err=0.0083 25% 0.0073 30% 0.0081
drp0% 0.2259 ssim-cent -t -I1 -C4 -F1 drp20% err=0.2395 25% 0.2307 30% 0.2244
drp0% 0.2168 ssim-cent -t -I1 -C6 -F2 drp20% err=0.2056 25% 0.2186 30% 0.2576
★ssim-20220301 … EMAX_DEPTH is now obtained from reg_ctrl
★ssim-20220220 … extract_1bit_edge:Laplacian 8-filter
ssim/ssim-zynq.emax6+dma -x -t -I1 -V2 -C4 -F1 TH=60 1.err=0.7230 2.err=0.6540 6.err=0.4218
ssim/ssim-cent -x -t -I1 -V2 -C6 -F2 TH=50 0.9000
TH=60 0.2583
TH=70 0.2609
TH=80 0.2657
TH=90 0.2681
TH=100 0.2658
-x -t -I0 -V3 -C1 -F1 TH=60 err=0.0365 96.4%
-x -t -I0 -V3 -C3 -F1 TH=60 err=0.0108 98.9%
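The notes describe extract_1bit_edge only as "Laplacian 8-filter" with a threshold TH. One plausible reading is the 8-neighbour Laplacian kernel followed by a threshold. The sketch below assumes 8-bit grayscale input, leaves border pixels at 0, and thresholds the absolute response. The name edge_1bit_sketch and all of those choices are assumptions, not the actual ssim code.

```c
#include <stdlib.h>

/* Assumed kernel: centre weight 8, all eight neighbours -1.
 * out[i] becomes 1 where |response| > th, otherwise 0. */
void edge_1bit_sketch(const unsigned char *in, unsigned char *out,
                      int width, int height, int th)
{
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            int idx = y * width + x;
            if (x == 0 || y == 0 || x == width - 1 || y == height - 1) {
                out[idx] = 0;               /* border: no full 3x3 window */
                continue;
            }
            int sum = 8 * in[idx];
            for (int dy = -1; dy <= 1; dy++)
                for (int dx = -1; dx <= 1; dx++)
                    if (dx != 0 || dy != 0)
                        sum -= in[(y + dy) * width + (x + dx)];
            out[idx] = (abs(sum) > th) ? 1 : 0;
        }
    }
}
```

Under this reading, a flat region gives a response of 0 and is never marked, so TH mainly controls how much contrast is needed before a pixel counts as an edge.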
★ssim-20220120 … Recompiled with conv-c2c-20220120
★ssim-20211210 … Recompiled with the conv-c2c version that adds exp=H3232, H1010
●ssim-20210920 … Recompiled with the conv-c2c version that adds sparse-matrix support
●ssim-20210801 … Sped up LMX (drain/load) with conv-c2c-20210801
●ssim-20210701 … Changed softu64() in emax6lib.c to the comparator type
ssim/ssim-zynq.emax6+dma
SPU_COUT_BITS 12 r4=3 -x -i -r -I0 -V3 -C1 -F1 -S err=0.0614 (H=50)
ssim/ssim-bsd.emax6nc
SPU_COUT_BITS 12 r4=3 -x -i -r -I0 -V3 -C1 -F1 -S err=0.0625
SPU_COUT_BITS 15 r4=3 -x -i -r -I0 -V3 -C1 -F1 -S err=0.0509
SPU_COUT_BITS 31 r4=3 -x -i -r -I0 -V3 -C1 -F1 -S err=0.0489
SPU_COUT_BITS 12 r4=3 -x -i -r -I1 -V2 -C1 -F1 -S err=0.7432
SPU_COUT_BITS 15 r4=3 -x -i -r -I1 -V2 -C1 -F1 -S err=0.7357
SPU_COUT_BITS 31 r4=3 -x -i -r -I1 -V2 -C1 -F1 -S err=0.6445
●ssim-20210620 … Added urand()
- ssim/ssim-cent.emax6nc -r -i -I0 -V3 -C1 -F1 -S err=0.06(16/32) err=0.04(12/32)
●ssim-20210607 … Pipelined OP_SFMA
- ssim/ssim-zynq.emax6+dma -t -I1 -C4 -F1 err=0.7368(01) err=0.2518(126)
●ssim-20210606 … Added OP_SFMA and OP_SFMAMAG
●ssim-20210604 … Merged spu.c into emax6lib.c and deleted spu.c. The MNIST anomaly was caused by RMGRP being too large
●ssim-20210603 … Added hbias/obias to LoadParam/StoreParam. hbias is also used in spike mode
●ssim-20210602 … UNARY8_FC IMAX2 version; changed to 32/32
- ssim/ssim-cent -r -i -I0 -V3 -C1 -F1 -S err=0.06(16/32) err=0.04(12/32)
●ssim-20210601 … UNARY8_FC first IMAX version
- ssim/ssim-cent -r -i -I0 -V3 -C1 -F1 -S err=0.50(64/32)
●ssim-20210520 … Added UNARY8_FC
- ssim/ssim-cent -r -i -I0 -V3 -C1 -F1 -S err=0.06(16/32) err=0.04(12/32)
●ssim-20210315 … The data distribution is now visualized as well
-t -r -i
0 0 0 CAMERA (random-W)
0 0 1 INFERENCE(random-W) one-shot
0 1 0 CAMERA (reuse-W)
0 1 1 INFERENCE(reuse-W) one-shot
1 0 0 TRAINING
1 0 1 TRAINING one-shot
1 1 0 TRAINING (reuse-W)
1 1 1 TRAINING (reuse-W) one-shot
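The table above maps the three flags -t, -r and -i to eight run modes. A small sketch of that mapping, treating the flags as a 3-bit index with -t as the high bit. The function names mode_index and mode_name are hypothetical. Only the eight strings come from the table.

```c
/* Pack the three flags into a table index: -t is bit 2, -r bit 1, -i bit 0. */
int mode_index(int t, int r, int i)
{
    return ((t != 0) << 2) | ((r != 0) << 1) | (i != 0);
}

/* Return the mode label from the -t / -r / -i table. */
const char *mode_name(int t, int r, int i)
{
    static const char *const names[8] = {
        "CAMERA (random-W)",
        "INFERENCE(random-W) one-shot",
        "CAMERA (reuse-W)",
        "INFERENCE(reuse-W) one-shot",
        "TRAINING",
        "TRAINING one-shot",
        "TRAINING (reuse-W)",
        "TRAINING (reuse-W) one-shot",
    };
    return names[mode_index(t, r, i)];
}
```

For example, the frequently used combination -r -i (without -t) selects index 3 in this sketch, the reuse-W one-shot inference row.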
●ssim-20210314 … MAGNI=[12]. Trying convs16tof32 MAGNI=1 MAGNI=2
- ssim/ssim-cent -i -I0 -V3 -C1 -F1 -S: err=0.0337 -S err=0.0349 0.0351
- ssim/ssim-cent -i -I1 -V2 -C1 -F1 -S: err=0.4453 -S err=0.4508 0.4507
- ssim/ssim-cent -i -I1 -V2 -C1 -F2 -S: err=0.4179 -S err=0.5373 0.5291
- ssim/ssim-cent -i -I1 -V2 -C1 -F3 -S: err=0.3789 -S err=0.6314 0.5067
●ssim-20210313 … MAGNI=16. Added ReuseWeight (-r). The err difference between -o and -i is due to bias=0
- ssim/ssim-cent -i -I0 -V3 -C1 -F1 -S: err=0.0337 -S err=0.0348
- ssim/ssim-cent -i -I1 -V2 -C1 -F1 -S: err=0.4453 -S err=0.4494
- ssim/ssim-cent -i -I1 -V2 -C1 -F2 -S: err=0.4179 -S err=0.5356
- ssim/ssim-cent -i -I1 -V2 -C1 -F3 -S: err=0.3789 -S err=0.5032
●ssim-20210312 … Renamed the old -i to -I. -i is now Inference_mode, which uses *.txt
●ssim-20210311 … SPIKING_FC: tuning the nout amplitude
convs16tos8(&D, C, 6); // fixed at 6
- ssim/ssim-zynq.dma -t -I1 -C4 -F1 : 01 err=0.7368 64 err=0.2652
- ssim/ssim-cent -t -I0 -V3 -C1 -F1 -S: 01 err=0.0726 105 err=0.0338○
- ssim/ssim-cent -t -I1 -V2 -C1 -F1 -S: 01 err=0.6887 112 err=0.4469○
- ssim/ssim-cent -t -I1 -V2 -C1 -F2 -S: 01 err=0.9000 35 err=0.6042×
- ssim/ssim-cent -t -I1 -V2 -C1 -F3 -S: 01 err=0.9000 35 err=0.8305×
●ssim-20210310 … SPIKING_FC: tuning the nout amplitude
wt is unexpectedly small. Is an offset needed for 32bit->8bit?
D10-40-ssim/ssim-cent -t -I0 -V3 -C1 -F1 -S: 01 err=0.0726 105 err=0.0338○
D10-30-ssim/ssim-cent -t -I1 -V2 -C1 -F1 -S: 01 err=0.6887 112 err=0.4469○
D10-50-ssim/ssim-cent -t -I1 -V2 -C1 -F2 -S: 01 err=0.9000 21 err=0.5579×
D10-70-ssim/ssim-cent -t -I1 -V2 -C1 -F3 -S: 01 err=0.9000 32 err=0.7579×
●ssim-20210309 … SPIKING_FC: tuning the nout amplitude
- ssim/ssim-cent -t -I0 -V3 -C1 -F1 -S: 01 err=0.4616 40 err=0.1146×
- ssim/ssim-cent -t -I1 -V2 -C1 -F1 -S: 01 err=0.8608 119 err=0.4647○
- ssim/ssim-cent -t -I1 -V2 -C1 -F2 -S: 01 err=0.9000 100 err=0.4653○
- ssim/ssim-cent -t -I1 -V2 -C1 -F3 -S: 01 err=0.9000 57 err=0.7328×
●ssim-20210308 … SPIKING_FC in preparation. #undef SPIKE8: conventional method (MNIST err=0.04)
#define SPIKE8: SPIKE8 (MNIST err=0.05)
Reducing the Σ output level to 1/4 improved err
The output level had been too large
- ssim/ssim-cent -t -I0 -V3 -C1 -F1 -S: 01 err=0.0623 94 err=0.0333
- ssim/ssim-cent -t -I1 -V2 -C1 -F1 -S: 01 err=0.6572 119 err=0.4563
- ssim/ssim-cent -t -I1 -V2 -C1 -F2 -S: 01 err=0.9000 35 err=0.5719
- ssim/ssim-cent -t -I1 -V2 -C1 -F3 -S: 01 err=0.9000 03 err=0.9000
●ssim-20210307 … SPIKING_FC in preparation. ssim/ssim-bsd -x -t -I0 -V3 -C1 -F1 -S
PBL1-4:NCHIP/IMAP/OMAP=2/2/16 PBL1-5:NCHIP/IMAP/OMAP=2/2/16
- ssim/ssim-zynq.dma -t -I1 -C4 -F1 : 01 err=0.7368
●ssim-20210306 … PBL1-4:IMAP/OMAP=2/16 PBL1-5:IMAP/OMAP=2/16
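If IMXlen*K is at most 32K/4, pload everything in one batch
●ssim-20210305 … Retired SMAX1. Added the -S option (non-spike during training, spike during classification)
Changed the -o option to write the weights at the lowest err to a file
Added IC/OC checks to for(ic/oc) in imax.c
Fixed Uint cc0[][] -> Ull cc0[][]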
- ssim/ssim-cent -t -I1 -V2 -C1 -F1 -S: 04 err=0.5219 123 err=0.4453
- ssim/ssim-cent -t -I1 -V2 -C1 -F2 -S: 04 err=0.5081 138 err=0.4179
- ssim/ssim-cent -t -I1 -V2 -C1 -F3 -S: 04 err=0.8685 126 err=0.3789
- ssim/ssim-cent -t -I1 -V2 -C1 -F3 : 04 err=0.8691 126 err=0.3801
- ssim/ssim-zynq.dma -t -I1 -C4 -F1 : 01 err=0.7224 28 err=0.3164
- ssim/ssim-cent -t -I1 -C4 -F1 : 01 err=0.6360 121 err=0.2259
- ssim/ssim-cent -t -I1 -C6 -F2 : 01 err=0.8070 132 err=0.2168
●ssim-20210304 … Retired the old SMAX1. Renamed the old NMAX1 to the new SMAX1
●ssim-20210302 … With conv-c2c-20210302, PBL1-5: IMAP=4, OMAP=8 ... OK
PBL1-4:2+8 1-5:2+8(arch28) ssim-zynq.emax6+dma -I1 -C4 -F1:iter=01 err=0.7589
PBL1-4:4+8 1-5:2+8(arch28) ssim-zynq.emax6+dma -I1 -C4 -F1:iter=01 err=0.7224
PBL1-4:2+8 1-5:4+8(arch29) ssim-zynq.emax6+dma -I1 -C4 -F1:iter=01 err=0.7271
PBL1-4:4+8 1-5:4+8(arch28) ssim-zynq.emax6+dma -I1 -C4 -F1:iter=01 err=0.7224
●ssim-20210301 … Changed the AXI<->LMM unit from 64bit to 32bit. mop (len is a multiple of 4B)
●ssim-20210224 … i,o->o,i
●ssim-20210223 … ic0[ic][oc],kp[ic][oc]->ic0[oc][ic],kp[oc][ic]
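●ssim-20210222 … test020 works with PBL1-5: IMAP=4, OMAP=8, so the change was applied here. However, ssim does not converge
Same result even after lowering the clock from 140MHz to 100MHz
●ssim-20210208 … ○nmax.c. SPIKING_FC cleanup only. It could be simplified, so it is frozen for now
●ssim-20210207 … ○nmax.c. NMORPHIC_FC in progress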
*A -= fc_eta * ( wd * *A + *B);
- ssim/ssim-bsd.nmax1 -t -I0 -V3 -C1 -F3: 36/150 err=0.0400
- ssim/ssim-cent.nmax1 -t -I1 -V2 -C1 -F2: 7/150 err=0.4841
- ssim/ssim-cent.nmax1 -t -I1 -V2 -C1 -F3: 5/150 err=0.8170
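The update `*A -= fc_eta * ( wd * *A + *B);` has the shape of an SGD step with weight decay. A minimal sketch applying it across an array, assuming A points at a weight and B at its gradient (the notes do not state this explicitly). The name fc_update_sketch is hypothetical.

```c
#include <stddef.h>

/* One SGD step with weight decay, element by element:
 *   w[i] -= fc_eta * (wd * w[i] + grad[i])
 * Assumption: A is the weight and B is the gradient in the original line. */
void fc_update_sketch(float *w, const float *grad, size_t n,
                      float fc_eta, float wd)
{
    for (size_t i = 0; i < n; i++)
        w[i] -= fc_eta * (wd * w[i] + grad[i]);
}
```

With wd set to 0 this reduces to `*A -= fc_eta * *B;`, the variant that is also tried in the ssim-20210206 notes.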
●ssim-20210206 … ○nmax.c. NMORPHIC_FC in progress
Moved back outside: *A -= fc_eta * ( wd * *A + *B);
- ssim/ssim-bsd.nmax1 -t -I0 -V3 -C1 -F3: 7/150 err=0.7477
- ssim/ssim-cent.nmax1 -t -I1 -V2 -C1 -F2: 7/150 err=0.4841
- ssim/ssim-cent.nmax1 -t -I1 -V2 -C1 -F3: 5/150 err=0.8170
With *A -= fc_eta * *B; the results are:
- ssim/ssim-bsd.nmax1 -t -I0 -V3 -C1 -F3: 37/150 err=0.0400
- ssim/ssim-cent.nmax1 -t -I1 -V2 -C1 -F2: 85/150 err=0.4499
- ssim/ssim-cent.nmax1 -t -I1 -V2 -C1 -F3: 11/150 err=0.9000×
●ssim-20210205 … ○nmax.c. NMORPHIC_FC in progress
Moved inside: *A -= (fc_eta * *B)/(batch_size/5);
- ssim/ssim-bsd.nmax1 -t -I0 -V3 -C1 -F3: 7/150 err=0.7478
- ssim/ssim-bsd.nmax1 -t -I0 -V3 -C1 -F2: 19/150 err=0.0409
- ssim/ssim-cent.nmax1 -t -I1 -V2 -C1 -F2: 43/150 err=0.4994
- ssim/ssim-cent.nmax1 -t -I1 -V2 -C1 -F3: 7/150 err=0.9000
●ssim-20210204 … ○nmax.c. Completed DIGITAL_FC with batch_size placed outermost
- ssim/ssim-bsd.nmax1 -t -I0 -V3 -C1 -F3: 7/150 err=0.7477
- ssim/ssim-cent.nmax1 -t -I0 -V3 -C1 -F2: 7/150 err=0.0590
- ssim/ssim-cent.nmax1 -t -I0 -V3 -C1 -F3: 7/150 err=0.7550
- ssim/ssim-cent.nmax1 -t -I1 -V2 -C1 -F3: 40/150 err=0.4347
- ssim/ssim-cent -t -I1 -V2 -C1 -F3: 126/150 err=0.3801
●ssim-20210203 … ○nmax.c in progress. Imported the conventional code into nmac.c as Ref; removed obias
- ssim/ssim-bsd.nmax1 -t -I0 -V3 -C1 -F2: 10/150 err=0.0507
- ssim/ssim-bsd.nmax1 -t -I0 -V3 -C1 -F3: 11/150 err=0.5328
- ssim/ssim-cent -t -I0 -V3 -C1 -F3: 109/150 err=0.0390
- ssim/ssim-cent -t -I0 -V3 -C1 -F4: 63/150 err=0.9972
●ssim-20210202 … ○nmax.c in progress. Spike visualization
- ssim-zynq.emax6+dma -t -I1 -C4 -F1: 1/150 err=0.7233
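●ssim-20210201 … ○Added Vectorwin for spike observation
●ssim-20210112 … ○NMAX: under construction
●ssim-20210111 … ○SMAX: Transformative Research Areas (学術変革), spike-based general-purpose computing (for Prof. 張); backprop planned
○NMAX: Spike-Stochastic with BLT/RNN structure (for Prof. 木村); no backprop planned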
●ssim-20210110 … V1:slit(8ch)+corner(8ch)
- ssim/ssim-cent.nmax1 -t -I0 -C1 -F1: 82/150 err=0.0117
- ssim/ssim-cent.nmax1 -t -I0 -V3 -C1 -F1: 77/150 err=0.0347
- ssim/ssim-cent.nmax1 -t -I1 -C4 -F1: 121/150 err=0.2281
- ssim/ssim-cent.nmax1 -t -I1 -C6 -F2: 150/150 err=0.2138
- ssim/ssim-cent.nmax1 -t -I1 -V2 -C1 -F1: 124/150 err=0.4502
- ssim/ssim-cent.nmax1 -t -I1 -V2 -C1 -F2: 95/150 err=0.4246
- ssim/ssim-cent.nmax1 -t -I1 -V2 -C1 -F3: 121/150 err=0.3779
- ssim/ssim-cent.nmax1 -t -I1 -V3 -C1 -F3: 82/150 err=0.3779
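●ssim-20210101 … Started building the CIFAR10/BLT filament spike siml model (-DNMAX1)
Makefile-cent.nmax1 … spine version without CNN
The bsd build runs out of memory with ssim, but ssim is still used as the base
tensor.c: only multiply_float2D is connected to nmax.c (NMAX1)
Assumes CNN is not used. Note that specifying -C (>1) also connects to multiply_float2D, so
in effect everything is executed by sgemm in nmax.c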
- ssim/ssim-cent.nmax1 -t -I1 -V2 -C1 -F1: 121/150 err=0.4738
- ssim/ssim-cent.nmax1 -t -I1 -V2 -C1 -F2: 121/150 err=0.4410
- ssim/ssim-cent.nmax1 -t -I1 -V2 -C1 -F3: 126/150 err=0.3907
●ssim-20201223 … epoch=150, ETA1=1.375/16, WD1=1.375/1024, divided by 8 every 40 epochs
- ssim/ssim-cent -t -I1 -C6 -F2 (left/right shift only): iter=126 err=0.2244
- ssim/ssim-cent -t -I1 -C4 -F1 (left/right shift only): iter=122 err=0.2418
- ssim/ssim-cent.emax6nc V2 -C4 -F1 (left/right shift only): iter= 4 err=0.5758
- ssim/ssim-cent.emax6nc V3 -C4 -F1 (left/right shift only): iter= 5 err=0.5585
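The ssim-20201223 settings describe a step schedule: the learning rate starts at ETA1=1.375/16 and is divided by 8 every 40 epochs. A sketch of that schedule, assuming a 0-based epoch counter and that the division happens exactly at epochs 40, 80 and 120. The function name step_eta_sketch and both assumptions are illustrative, not taken from the ssim sources.

```c
/* Assumed schedule: start at eta_init and divide by 8 after every 40 epochs.
 * `epoch` is taken to be 0-based here, which is an assumption. */
float step_eta_sketch(float eta_init, int epoch)
{
    float eta = eta_init;
    for (int k = 0; k < epoch / 40; k++)
        eta /= 8.0f;
    return eta;
}
```

Under these assumptions a 150-epoch run sees three reductions, so the final rate is ETA1/512.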
●ssim-20201222 … epoch=150; changed the initial ETA to 1.25/16; WD fixed at 0.00001f
- ssim/ssim-cent -t -I1 -C6 -F2 (left/right shift only): iter=149 err=0.2584
- ssim/ssim-cent -t -I1 -C4 -F1 (left/right shift only): iter=109 err=0.2604
●ssim-20201221 … epoch=200, flip + left/right shift. Training is possible even with softf8() when V3 is used
- ssim/ssim-cent -t -I1 -C6 -F2 (left/right shift only): iter=157 err=0.2252
- ssim/ssim-cent -t -I1 -C4 -F1 (left/right shift only): iter=194 err=0.2418
- ssim/ssim-cent.emax6nc -C4 -F1 (left/right shift only): iter= 1 err=0.8996
- ssim/ssim-cent.emax6nc V2 -C4 -F1 (left/right shift only): iter= 4 err=0.5755
- ssim/ssim-cent.emax6nc V3 -C4 -F1 (left/right shift only): iter= 5 err=0.5401
/* MNIST-V3 epc1 epc2 epc7 C10-V3 epc1 epc2 epc7 exp */
//out_f32.exp = (out_f32.exp< 0)?0:(out_f32.exp<255)?out_f32.exp:255; out_f32.frac &= 0x7f0000; /* 16bit e=8 f=7 .0486 .0420 .0367 OK * .7886 .7180 .5349 OK 130-71 */
//out_f32.exp = (out_f32.exp< 0)?0:(out_f32.exp<255)?out_f32.exp:255; out_f32.frac &= 0x7e0000; /* 15bit e=8 f=6 * .7689 .7413 .5847 OK 130-71 */
//out_f32.exp = (out_f32.exp< 0)?0:(out_f32.exp<255)?out_f32.exp:255; out_f32.frac &= 0x7c0000; /* 14bit e=8 f=5 .8191 .7660 .6286 131-73 */
//out_f32.exp = (out_f32.exp< 0)?0:(out_f32.exp<255)?out_f32.exp:255; out_f32.frac &= 0x780000; /* 13bit e=8 f=4 .0534 .0462 .0390 OK .8208 .8358 .7129 131-70 */
//out_f32.exp = (out_f32.exp< 5)?0:(out_f32.exp<132)?out_f32.exp:132; out_f32.frac &= 0x780000; /* 12bit e=7 f=4 .0534 .0462 .0390 OK .8208 .8358 .7129 131-70 */
//out_f32.exp = (out_f32.exp< 69)?0:(out_f32.exp<132)?out_f32.exp:132; out_f32.frac &= 0x7c0000; /* 12bit e=6 f=5 .8191 .7660 .6286 131-69 */
out_f32.exp = (out_f32.exp<101)?0:(out_f32.exp<132)?out_f32.exp:132; out_f32.frac &= 0x7e0000; /* 12bit e=5 f=6 ** .7671 .7409 .5891 OK 130-72 */
//out_f32.exp = (out_f32.exp<117)?0:(out_f32.exp<132)?out_f32.exp:132; out_f32.frac &= 0x7f0000; /* 12bit e=4 f=7 .0487 .0422 .0366 OK .8257 .8223 .8952 NG 128-76 */
*o = (out_f32.exp==0)?0.0:*(float*)&out_f32;
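The uncommented line above keeps a 12-bit float (1 sign, 5 exponent, 6 fraction bits) by flushing small exponents to zero, saturating large ones, and masking the fraction. A self-contained sketch of that one setting, working on the raw IEEE 754 bits instead of the out_f32 bit-field struct. The function name truncate_f32_sketch is hypothetical, and preserving the sign bit unchanged is an assumption.

```c
#include <stdint.h>
#include <string.h>

/* Sketch of the active "12bit e=5 f=6" setting:
 * biased exponents below 101 flush to zero, exponents of 132 or more
 * saturate at 132, and only the top 6 fraction bits (mask 0x7e0000) stay. */
float truncate_f32_sketch(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);          /* avoids pointer-cast aliasing */

    uint32_t sign = bits & 0x80000000u;
    uint32_t exp  = (bits >> 23) & 0xffu;
    uint32_t frac = bits & 0x007fffffu;

    exp  = (exp < 101u) ? 0u : (exp < 132u) ? exp : 132u;
    frac &= 0x007e0000u;

    if (exp == 0u)
        return 0.0f;                         /* mirrors (out_f32.exp==0)?0.0 */

    bits = sign | (exp << 23) | frac;
    float out;
    memcpy(&out, &bits, sizeof out);
    return out;
}
```

Note that saturation keeps the fraction bits, so 100.0f (biased exponent 133) comes back as 50.0f in this sketch. The other commented-out settings differ only in the two exponent bounds and the fraction mask.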
●ssim-20201220 … Replaced only the arithmetic with spu.c
emax6lib.c:OP_FMA/FMS/FML/FAD
softf8(f1.f, f2.f, f3.f, &f0.f); /* f1 + f2 * f3 -> f0 */
epoch 1/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.901300 *
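Scaling *o
*o = *o * 1.18 ... partway through 1/99 it turned into a suspicious red-tinted pattern
*o = *o * 1.15 ... partway through 1/99 it turned into a suspicious red-tinted pattern
*o = *o * 1.10 ... feature patterns appear, err=0.98
*o = *o * 1.00 ... feature patterns are completely black, err=0.98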
----
【rsim … brain simulator】 2019/11/01
★rsim-20220720 … Removed the FT patch (added dmrp_stat to LMRING_BUSY); emax6lib.c changed
★rsim-20220301 … EMAX_DEPTH is now obtained from reg_ctrl
★rsim-20220220 … follow ssim
★rsim-20220120 … follow ssim
★rsim-20211210 … follow ssim
●rsim-20210920 … follow ssim
●rsim-20210801 … follow ssim
●rsim-20210701 … follow ssim
●rsim-20210620 … follow ssim
●rsim-20210607 … follow ssim
●rsim-20210606 … follow ssim
●rsim-20210604 … follow ssim
●rsim-20210315 … follow ssim
●rsim-20210314 … follow ssim
●rsim-20210312 … follow ssim
●rsim-20210307 … PBL1-4:NCHIP/IMAP/OMAP=2/2/16 PBL1-5:NCHIP/IMAP/OMAP=2/2/16 fails (×)
PBL1-4:NCHIP/IMAP/OMAP=2/4/8 PBL1-5:NCHIP/IMAP/OMAP=2/4/8
- rsim/rsim-zynq.dma -t -I1 -C4 -F1 : 01 err=0.7570
●rsim-20210306 … PBL1-4:IMAP/OMAP=2/16 PBL1-5:IMAP/OMAP=2/16
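If IMXlen*K is at most 32K/4, pload everything in one batch
●rsim-20210305 … Changed the -o option to write the weights at the lowest err to a file
Added IC/OC checks to for(ic/oc) in imax.c
Fixed Uint cc0[][] -> Ull cc0[][]. IMAP=4, OMAP=8 also works correctly
●rsim-20210302 … Even with conv-c2c-20210302, PBL1-5: IMAP=4, OMAP=8 does not work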
PBL1-4:2+8 1-5:2+8(arch29) rsim-zynq.emax6+dma -I1 -C4 -F1:iter=01 err=0.7589
PBL1-4:4+8 1-5:2+8(arch28) rsim-zynq.emax6+dma -I1 -C4 -F1:iter=01 err=0.9000
PBL1-4:2+8 1-5:4+8(arch28) rsim-zynq.emax6+dma -I1 -C4 -F1:iter=01 err=0.9000
PBL1-4:4+8 1-5:4+8(arch28) rsim-zynq.emax6+dma -I1 -C4 -F1:iter=01 err=0.9000
●rsim-20210301 … Changed the AXI<->LMM unit from 64bit to 32bit. mop (len is a multiple of 4B)
●rsim-20210224 … i,o->o,i
●rsim-20210223 … ic0[ic][oc],kp[ic][oc]->ic0[oc][ic],kp[oc][ic]
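●rsim-20210222 … test020 worked with PBL1-5: IMAP=4, OMAP=8, so the change was applied here. However, rsim does not converge
Same result even after lowering the clock from 140MHz to 100MHz
●rsim-20201207 … PBL1-5 also worked with IMAP=2, OMAP=8.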
- rsim/rsim-cent -t -I1 -C4 -F1: miniter=95 minerr=0.2772
- rsim/rsim-cent -t -I1 -V2 -C6 -F2: miniter=85 minerr=0.2990
- rsim/rsim-cent.emax6nc -C4 -F1: miniter=73 minerr=0.2595
- rsim-zynq.emax6+dma -I1 -C4 -F1: miniter=99 minerr=0.2791
epoch 1/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.753500 *
CONV_BACKWARD_CNMUL1 : 1562.080sec(68.27%)
epoch 2/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.708300 *
epoch 3/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.630800 *
epoch 4/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.592000 *
epoch 98/99: cnn_eta=0.005120 fc_eta=0.010240 err=0.279200 *
epoch 99/99: cnn_eta=0.005120 fc_eta=0.010240 err=0.279100 *
●rsim-20201206 … PBL1-4 worked with IMAX=2, OMAP=8 after swapping core1().
epoch 1/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.753500 *
CONV_BACKWARD_CNMUL1 : 1724.940sec(70.41%)
●rsim-20201205 … Multi-staging of ich in progress (still works only with IMAP=1)
●rsim-20201204 … PBL1-4OMAP=16 PBL1-5OMAP=16 rsim.emax6+dma -t -I0 -C1 -F1 (arch28)
epoch 1/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.076600 *
CONV_BACKWARD_CNMUL1: 94.900sec(40.47%)
- PBL1-4OMAP=8 PBL1-5OMAP=16 rsim.emax6+dma -t -I0 -C1 -F1 (arch28)
epoch 1/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.070400 *
CONV_BACKWARD_CNMUL1: 122.000sec(47.07%)
★PBL1-4OMAP=8 PBL1-5OMAP= 8 rsim.emax6+dma -t -I0 -C1 -F1 (arch28)
epoch 1/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.069900 *
CONV_BACKWARD_CNMUL1: 159.160sec(53.70.07%)
★PBL1-4OMAP=8 PBL1-5OMAP= 8 rsim.emax6+dma -t -I1 -C4 -F1 (arch29)
epoch 1/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.752300 *
CONV_BACKWARD_CNMUL1: 1980.240sec(72.75%)
●rsim-20201203 … Implemented PBL1-4 with OMAP=8 and 1-5 with OMAP=1 (OC=6 of MNIST turns black, but training is normal)
DMA is suppressed when mop(top=0). emax6lib.c also changed
- rsim-cent: err=0.7558
- rsim-cent.emax6nc: err=0.8995
- rsim-bsd.emax6nc: err=0.7551
●rsim-20201202 … Implemented PBL1-4 with OMAP=4 (anomalies are seen with OMAP=8)
rsim.emax6+dma -t -I1 -C4 -F1 (arch28)
epoch 1/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.755900 *
CONV_BACKWARD_CNMUL1 : 7181.230sec(90.72%)
●rsim-20201201 … Finished speeding up PBL1-4 and 1-5. PBL1_4_VERSION1, PBL1_5_VERSION1
Verified MNIST + CIFAR10 (arch28) up to epoch 1/99
MNIST epoch 1/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.0698 *
CIFAR10 epoch 1/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.7635 *
●rsim-20201120 … Recompiled with the first emax6lib version that uses DP for unaligned-load
conv-c2c-20201120, step4000-ZCU102-20201120 WNS=-2.2 (130MHz) (arch28)
epoch 1/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.753400 *
●rsim-20201103 … Version with WSWAP removed. rsim.emax6+dma -t -I1 -C4 -F1
conv-c2c-20201102, step4000-ZCU102-20201103 WNS=-1.9 (140MHz) (arch28)
epoch 1/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.759000 *
conv-c2c-20201102, step4000-ZCU102-20201103 WNS=-2.2 (130MHz) (arch28)
epoch 1/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.753400 *
epoch 2/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.719100 *
epoch 3/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.628100 *
epoch 4/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.561600 *
epoch 5/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.541900 *
epoch 6/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.526000 *
conv-c2c-20201102, step4000-ZCU102-20201103 WNS=-2.2 (140MHz) (arch29)
epoch 1/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.759100 *
epoch 2/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.721700 *
epoch 3/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.635400 *
epoch 4/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.576900 *
epoch 5/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.543200 *
●rsim-20201102 … Applied conv-c2c-20201102/emax6.h and emax6lib.c
●rsim-20201101 … Moved spu.c to proj-arm64/fpga/step4000-ZCU102/. Removed smax and nmax
●rsim-20201020 … For step4000-ZCU102-20201020 LMM_SIZE_IS_64K
sgemm00(): ka=1296 and 1296*RMGRP*4=26KB (<32KB), so RMGRP=5 can be used uniformly
●rsim-20201001 … Fixed emax6lib.c. emax6_kick_dma() seg-faulted when the start was odd and the length was 2
★step4000-ZCU102-20201010 … Uses the 三精システム 8-lane high-speed version
rsim/rsim-zynq.emax6+dma -x -t -I1 -C4 -F1
params: WD/HT=320/320 TRAINING_MODE CIFAR10 CNN(original)MODE CNN_DEPTH=4 FC_DEPTH=1
model: conv{32 3 5 28 11 2} conv{14 11 3 14 16 2} conv{7 16 2 7 32 1} conv{7 32 2 6 32 2} fc{10}
membase: 14a79000
i_inp : 14a79000-14bf4aff
i_ker : 14bf4b00-14bf8aff
i_out : 14bf8b00-14f8f6ff
i_m0A : 14a79000-14ae97ff
i_m0B : 14ae9800-14af4bff
i_m0C : 14af4c00-14af604f
num_image=50000 width=32 height=32 nlabel=50000
finish loading 50000x3072 matrix from ../image-data/cifar-train-image, shuffle=1
num_image=10000 width=32 height=32 nlabel=10000
finish loading 10000x3072 matrix from ../image-data/cifar-test-image, shuffle=0
epoch 1/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.753900 *
TRAINING : 23657.840sec(99.73%)
TESTING : 56.430sec(0.24%)
NN_FORWARD : 614.060sec(2.59%)
CONV_FORWARD : 522.520sec(2.20%)
CONV_FORWARD_UNPACK : 0.000sec(0.00%)
CONV_FORWARD_CNMUL : 522.480sec(2.20%)
CONV_FORWARD_RESHAPE : 0.000sec(0.00%)
NN_FORWARD_RELU : 25.460sec(0.11%)
NN_FORWARD_POOLING : 23.380sec(0.10%)
NN_FORWARD_FCMUL : 15.550sec(0.07%)
NN_FORWARD_SOFTMAX : 0.120sec(0.00%)
NN_BACKWARD : 23086.710sec(97.32%)
NN_BACKWARD_FCMUL1 : 4.110sec(0.02%)
NN_BACKWARD_FCMUL2 : 3.660sec(0.02%)
NN_BACKWARD_UNPOOLING : 51.660sec(0.22%)
NN_BACKWARD_RELU : 27.820sec(0.12%)
CONV_BACKWARD : 22992.160sec(96.92%)
CONV_BACKWARD_UNPACK : 0.000sec(0.00%)
CONV_BACKWARD_RESHAPE: 0.000sec(0.00%)
CONV_BACKWARD_CNMUL1 : 22992.150sec(96.92%)
CONV_BACKWARD_CNMUL2 : 0.000sec(0.00%)
CONV_BACKWARD_PACK : 0.000sec(0.00%)
epoch 2/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.711800 *
epoch 3/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.622600 *
epoch 4/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.576900 *
epoch 5/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.534900 *
epoch 6/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.515000 *
epoch 7/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.494300 *
epoch 8/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.475600 *
epoch 9/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.445800 *
epoch 10/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.433300 *
epoch 11/99: cnn_eta=0.080000 fc_eta=0.160000 err=0.427200 *
●rsim-20200920 … For rsim/rsim-zynq.emax6+dma -x -t -I0 -C1 -F1:
sgemm00(): ka=1296, so 1296*RMGRP*4=26KB (>16KB), which does not fit in LMM
Changed RMGRP from 5 to 2, a value that divides m=100 evenly
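The RMGRP choice in the rsim-20200920 notes follows from two constraints: ka*RMGRP*4 bytes must fit in the available LMM space, and RMGRP must divide m. A sketch of that selection, assuming 4-byte floats and that "fits" means at most the stated budget (16KB here, 32KB in the rsim-20201020 note). The function pick_rmgrp_sketch and its parameters are hypothetical.

```c
/* Largest RMGRP in [1, max_rmgrp] that divides m and whose
 * ka * RMGRP * 4-byte footprint fits in lmm_bytes; 0 if none fits. */
int pick_rmgrp_sketch(int ka, int m, int lmm_bytes, int max_rmgrp)
{
    for (int g = max_rmgrp; g >= 1; g--) {
        long bytes = (long)ka * g * 4;      /* 4 bytes per float */
        if (m % g == 0 && bytes <= lmm_bytes)
            return g;
    }
    return 0;
}
```

With ka=1296 and m=100, a 16KB budget yields 2 (3 would fit but does not divide 100), and a 32KB budget yields 5, which agrees with both notes.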
●rsim-20200919 … Updated smax.c and nmax.c to match imax.c. Updated spu.c
●rsim-20200901 … Completed the simplified IMAX version of PBL1-4 and 1-5. Speeding it up is future work
CNN=0.08 FC=0.16 rsim.emax6+dma -t -I1 -C4 -F1 (arch28) iter=01 err=0.7539
iter=02 err=0.7118
iter=03 err=0.6226
CNN=0.10 FC=0.24 rsim.emax6+dma -t -I1 -C4 -F1 (arch28) iter=01 err=0.8999
CNN=0.08 FC=0.16 rsim-cent.emax6nc -t -I1 -C4 -F1 (cad110) iter=93 err=0.2731
CNN=0.08 FC=0.16 rsim-cent -t -I1 -C4 -F1 (cad110) iter=95 err=0.2772
CNN=0.08 FC=0.16 rsim-bsd.emax6nc -t -I1 -C4 -F1 (arfs00) iter=64 err=0.2671
CNN=0.08 FC=0.16 rsim-bsd -t -I1 -C4 -F1 (arfs00) iter=78 err=0.2703
CNN=0.08 FC=0.16 rsim-bsd.emax6nc -t -I1 -C4 -F1 (arch00) iter=04 err=0.5610
CNN=0.08 FC=0.16 rsim-bsd -t -I1 -C4 -F1 (arch00) iter=04 err=0.5484
TRAINING : 5924.094sec(99.02%)
TESTING : 32.711sec(0.55%)
NN_FORWARD : 367.852sec(6.15%)
CONV_FORWARD : 360.609sec(6.03%)
CONV_FORWARD_UNPACK : 0.000sec(0.00%)
CONV_FORWARD_CNMUL : 360.594sec(6.03%)
CONV_FORWARD_RESHAPE : 0.000sec(0.00%)
NN_FORWARD_RELU : 1.344sec(0.02%)
NN_FORWARD_POOLING : 3.180sec(0.05%)
NN_FORWARD_FCMUL : 0.820sec(0.01%)
NN_FORWARD_SOFTMAX : 0.016sec(0.00%)
NN_BACKWARD : 5587.531sec(93.39%)
NN_BACKWARD_FCMUL1 : 0.000sec(0.00%)
NN_BACKWARD_FCMUL2 : 0.008sec(0.00%)
NN_BACKWARD_UNPOOLING : 6.180sec(0.10%)
NN_BACKWARD_RELU : 1.719sec(0.03%)
CONV_BACKWARD : 5578.664sec(93.24%)
CONV_BACKWARD_UNPACK : 0.000sec(0.00%)
CONV_BACKWARD_RESHAPE: 0.000sec(0.00%)
CONV_BACKWARD_CNMUL1 : 5578.648sec(93.24%)
CONV_BACKWARD_CNMUL2 : 0.000sec(0.00%)
CONV_BACKWARD_PACK : 0.000sec(0.00%)
●rsim-20200830 … Fixed PAD in imax_conv_backward; prototyping PBL1-4 and 1-5
●rsim-20200803 … Moved the PBL1-5 oc-loop outward. Fine-tuned ETA per OS.
epoch 1/99: cnn_eta=0.100000 fc_eta=0.160000 err=0.7523
epoch 2/99: cnn_eta=0.100000 fc_eta=0.160000 err=0.7080
epoch 23/99: cnn_eta=0.032768 fc_eta=0.052429 err=0.3908
epoch 25/99: cnn_eta=0.026214 fc_eta=0.041943 err=0.3829
FC_ETA_INI=0.160 rsim.emax6+dma -t -I1 -C4 -F1 (arch28) iter=59 err=0.3481
FC_ETA_INI=0.240 rsim-cent.emax6nc -t -I1 -C4 -F1 -V2 iter=52 err=0.4205
FC_ETA_INI=0.240 rsim-cent.emax6nc -t -I1 -C4 -F1 (cad112) iter=50 err=0.3338
FC_ETA_INI=0.240 rsim-cent -t -I1 -C4 -F1 (cad111) did not converge; ETA=0.08: OK
FC_ETA_INI=0.160 rsim-bsd.emax6nc -t -I1 -C4 -F1 (arfs00) iter=91 err=0.3143
FC_ETA_INI=0.160 rsim-bsd -t -I1 -C4 -F1 (arfs00) iter=78 err=0.2703
FC_ETA_INI=0.160 rsim-bsd.emax6nc -t -I1 -C4 -F1 (arch00) iter=01 err=0.7532
FC_ETA_INI=0.160 rsim-bsd -t -I1 -C4 -F1 (arch00) iter=01 err=0.7556
●rsim-20200802 … CNN_ETA_MAX 0.100f, CNN_ETA_MIN 0.006f
FC_ETA_INI=0.160 rsim.emax6+dma -t -I1 -C4 -F1 (arch28) iter=13 err=0.4615
FC_ETA_INI=0.240 rsim-cent.emax6nc -t -I1 -C4 -F1 -V2 iter=52 err=0.4205
FC_ETA_INI=0.240 rsim-cent.emax6nc -t -I1 -C4 -F1 (cad111) iter=50 err=0.3338
FC_ETA_INI=0.240 rsim-cent -t -I1 -C4 -F1 (cad111) did not converge; ETA=0.08: OK
FC_ETA_INI=0.160 rsim-bsd.emax6nc -t -I1 -C4 -F1 (arfs00) iter=79 err=0.3055
FC_ETA_INI=0.160 rsim-bsd -t -I1 -C4 -F1 (arfs00) iter=52 err=0.2671
FC_ETA_INI=0.160 rsim-bsd.emax6nc -t -I1 -C4 -F1 (arch00) iter=66 err=0.3207
●rsim-20200801 … Removed tmp_col and tmp_dst from orig() of PBL-1.4.
CNN_ETA_MAX 0.080f, CNN_ETA_MIN 0.005f
direct add rsim.emax6+dma -t -I1 -C4 -F1 (arch28) iter=01 err=0.9023
direct add rsim-cent.emax6nc -t -I1 -C4 -F1 -V2 (cad111) =02 err=0.6520
rsim-cent.emax6nc -t -I1 -C4 -F1 (cad111) did not converge
rsim-bsd.emax6nc -t -I1 -C4 -F1 (arfs00) iter=95 err=0.3211
direct add rsim-bsd.emax6nc -t -I1 -C4 -F1 (arfs00) iter=91 err=0.3143
direct add rsim-bsd.emax6nc -t -I1 -C4 -F1 (arch00) iter=01 err=0.7589
●rsim-20200730 … orig() of PBL-1.4 OK. Changed the ETA condition (CentOS is now treated the same as ZYNQ)
rsim.emax6+dma -t -I1 -C4 -F1 (arch28) iter=09 err=0.5526
rsim-cent.emax6nc -t -I1 -C4 -F1 (cad112) iter=95 err=0.3379
rsim-cent -t -I1 -C4 -F1 (cad110) iter=61 err=0.2722
rsim-bsd.emax6nc -t -I1 -C4 -F1 (arfs00) iter=95 err=0.3211
rsim-bsd -t -I1 -C4 -F1 (arfs00) iter=78 err=0.2703
rsim-bsd -t -I1 -C4 -F1 (arch00) iter=78 err=0.2778
●rsim-20200720 … smax1nc nmax1nc: cleanup of the conventional computation only. Verified correct operation. Added spu.c
rsim-cent.emax6nc -t -I1 -C4 -F1 (cad112) iter=96 err=0.2725
●rsim-20200716 … rsim-bsd is NG. #if defined(ARMZYNQ) FC_ETA_MAX 0.320f FC_ETA_MIN 0.020f
#else FC_ETA_MAX 0.160f FC_ETA_MIN 0.010f
●rsim-20200710 … smax.c nmax.c initial templates
●rsim-20200703 … Completed the n(10) limit on N(12) in imax_sgemm00() (i_m0C already shrunk)
rsim.emax6+dma -t -I1 -C4 -F1 (arch28) iter=23 err=0.3605
rsim-cent.emax6nc -t -I1 -C4 -F1 (cad109) iter=81 err=0.2727
●rsim-20200702 … Introduced the n(10) limit on N(12) (CEXE+CST) in imax_sgemm00() (conv-c2c modified)
Shrinking i_m0C is not yet supported
rsim.emax6+dma -t -I1 -C4 -F1 (arch28) iter=11 err=0.4475
rsim-cent.emax6nc -t -I1 -C4 -F1 (cad109) iter=81 err=0.2727
●rsim-20200701 … imax_sgemm00() provisional
rsim.emax6+dma -t -I1 -C4 -F1 (arch28) iter=21 err=0.3713
rsim-cent.emax6nc -t -I1 -C4 -F1 (cad110) iter=81 err=0.2727
rsim/rsim -t -I1 -V3 -C3 -F1 (cad111) iter=44 err=0.3119
rsim/rsim -t -I1 -V3 -C4 -F1 (cad109) iter=57 err=0.3101
rsim/rsim -t -I1 -V3 -C5 -F1 (cad112) iter=63 err=0.2962
rsim/rsim -t -I1 -V3 -C6 -F1 (cad112) iter=56 err=0.3147
rsim/rsim -t -I1 -V2 -C3 -F1 (cad111) NG
rsim/rsim -t -I1 -V2 -C4 -F1 (cad109) iter=66 err=0.3078
rsim/rsim -t -I1 -V2 -C5 -F1 (cad112) iter=58 err=0.2920
rsim/rsim -t -I1 -V2 -C6 -F1 (cad112) NG
rsim/rsim -t -I1 -V2 -C6 -F2 (cad112) iter=65 err=0.2991
Multiplication error? Raising fc_eta to 4x cnn_eta: OK
Multiplication error? Lowering fc_eta to the same value as cnn_eta: NG
rsim.emax6+dma -t -I1 -C4 -F1 (arch28) NG 0.90
rsim.emax6+dma IMAX + no i_m0C write-back: OK
●rsim-20200630 … Consolidated imax_conv_bbackward() and imax_sgemm() into imax.c
rsim.emax6+dma -t -I1 -C4 -F1 (arch28) OK
●rsim-20200625 … Revised eta/wd
rsim.emax6+dma -t -I1 -C4 -F1 (arch28) iter=66 err=0.2772
rsim/rsim.emax6nc -t -I1 -C4 -F1 (cad111) iter=91 err=0.2729
rsim/rsim -t -I1 -C4 -F1 (cad111) iter=95 err=0.2772
●rsim-20200613 … Folded img into the 2x2 IMAX double loop. Made in/out addresses contiguous
rsim.emax6+dma -t -I0 -C1 -F1 (arch29) iter=98 err=0.0127
rsim.emax6+dma -t -I1 -C4 -F1 (arch28) iter=79 err=0.2741
rsim/rsim.emax6nc -t -I1 -C4 -F1 (cad110) iter=99 err=0.2767
●rsim-20200611 … Folded img into the 2x2 IMAX double loop. emax6nc works, but out still needs to be made contiguous
●rsim-20200610 … Added order to imax_cpyin() (switches between batch,oc and oc,batch)
rsim/rsim.emax6nc -t -I1 -C4 -F1 (centos) iter=80 err=0.2724
●rsim-20200608 … Loop interchange for 2x2 (rofs moved outside and img inside to secure the burst_exec length)
rsim/rsim.emax6nc -t -I1 -C4 -F1 (centos) iter=80 err=0.2724
●rsim-20200601 … A variable can now be passed to mop(force) (possible since conv-c2c-20200601)
rsim.emax6+dma -t -I1 -C4 -F1 (zcu102) iter=94 err=0.2842
●rsim-20200531 … Works on IMAX (provisional version with force fixed to 1)
rsim.emax6+dma -t -I0 -C1 -F1 (zcu102) iter=98 err=0.0127
●rsim-20200525 … Preparing the IMAX connection (handles OC that is not a multiple of 4)
●rsim-20200516 … Preparing the IMAX connection (imax_conv_forward completed)
rsim/rsim.emax6nc -t -Ii -C4 -F1 (centos) iter=98 err=0.2720
rsim/rsim -t -Ii -C4 -F1 (centos) iter=93 err=0.2737
●rsim-20200420 … Preparing the IMAX connection
rsim/rsim.emax6nc -t -Ii -C4 -F1 (centos) iter=67 err=0.2661
rsim/rsim -t -Ii -C4 -F1 (centos) iter=82 err=0.2680
rsim/rsim.emax6nc -t -Ii -C4 -F1 (zcu102) iter=42 err=0.2725
rsim/rsim -t -Ii -C4 -F1 (zcu102) iter=52 err=0.2675
●rsim-20200418 … Preparing the IMAX connection
rsim/rsim.emax6nc -t -Ii -C4 -F1 (centos) iter=95 err=0.2767
●rsim-20200414 … Preparing the IMAX connection. Changed Dll from long double to Ull[2]
●rsim-20200409 … Preparing the IMAX connection. Improved the reference of imax_conv_forward() for IMAX
●rsim-20200408 … Preparing the IMAX connection. imax_conv_forward() runs as pseudocode
rsim/rsim -t -Ii -C4 -F1 (centos) iter=93 err=0.2737
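On zcu102, gcc -O2 generates fmadd and training does not converge. gcc -O1 converges
●rsim-20200407 … Preparing the IMAX connection. imax_conv_forward() runs as pseudocode
rsim/rsim -t -I0 -x trains and recognizes correctly
●rsim-20200211 … Gradual fine-tuning with UPDATE_WD1/2/3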
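The note above concerns fmadd, the AArch64 fused multiply-add, which rounds a*b+c once instead of rounding the product and the sum separately. The sketch below only illustrates that the two can differ in the last bits; the entry does not say whether this is the actual cause of the non-convergence. Both function names are hypothetical, and the fused result is emulated in double so the example does not depend on FMA hardware.

```c
#include <assert.h>

/* a*b + c with the product rounded to float before the add,
   which is what separate multiply and add instructions compute. */
static float mul_then_add(float a, float b, float c)
{
    volatile float p = a * b;   /* volatile forces the product to be rounded to float */
    assert(p == p || a != a || b != b);  /* p is NaN only if an input is NaN or inf*0 occurs */
    return p + c;
}

/* a*b + c with rounding to float only at the end, emulating a fused
   multiply-add. A float*float product is exact in double (24+24 <= 53 bits);
   the double sum may still round, so this is an approximation of true fma. */
static float fused_like(float a, float b, float c)
{
    double exact = (double)a * (double)b;
    return (float)(exact + (double)c);
}
```

With a = 1 + 1/4096 and c = -(1 + 1/2048), the separately rounded form yields 0 while the fused-like form yields 1/16777216. If the difference matters, GCC's -ffp-contract=off option is one way to stop the compiler from fusing the operations at higher optimization levels.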
rsim/rsim -t -I1 -C4 -F1 (centos) iter=82 err=0.2680
rsim/rsim -t -I1 -V2 -C4 -F1 (centos) iter=81 err=0.3047
●rsim-20200210 … For CIFAR, flipped input images are also used for training
rsim/rsim -t -I1 -C4 -F1 (centos) iter=85 err=0.2880
rsim/rsim -t -I1 -V2 -C4 -F1 (centos) iter=95 err=0.3160
●rsim-20200118 … Dynamic eta adjustment (multiply by 0.8 when err increases; restore the original value when eta<0.01)
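The rsim-20200118 rule above can be sketched as a small C function. The factor 0.8 and the threshold 0.01 come from the entry; the function name, the parameter names, and the once-per-round call pattern are assumptions made for illustration.

```c
#include <assert.h>

#define ETA_DECAY 0.8f   /* factor from the entry */
#define ETA_FLOOR 0.01f  /* threshold from the entry */

/* Returns the learning rate to use for the next round.
   eta0 is the initial ("original") rate; the name is hypothetical. */
static float adjust_eta(float eta, float eta0, float prev_err, float err)
{
    assert(eta0 > 0.0f);
    if (err > prev_err)      /* error got worse: shrink the step */
        eta *= ETA_DECAY;
    if (eta < ETA_FLOOR)     /* step became too small: start over */
        eta = eta0;
    return eta;
}
```

Called once per round with the previous and current error, the rate decays geometrically while the error keeps rising and jumps back to eta0 once it falls below the floor.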
●rsim-20200117 … Multi-layer FC support completed rsim/rsim -x -t -I0 -V3 -C1 -F2 (round 20/80: test-err=0.040400)
rsim/rsim -x -t -I0 -V3 -C1 -F2 has poor accuracy
●rsim-20200116 … Multi-layer FC support in progress
●rsim-20200115 … CIFAR10 can be input as-is in "32x32&color"; separated the MNIST and CIFAR10 models
●rsim-20200114 … In addition to extract_slit(), Ipl2F4h() now adds the original image to channel[8], improving accuracy
●rsim-20200113 … Added -S<num> to select SLIT_TYPE (-S3 for MNIST and -S2 for CIFAR give the highest recognition rates)
●rsim-20200112 … Added copy_H_to_RGB(). The last hidden layer is also displayed
●rsim-20200111 … Added enable_pad printf("(height - psize)/pstride + 1 == oheight || height == oheight\n");
●rsim-20200110 … Increased to iter=100
【Search narrowed to promising combinations】 CIFAR10 cnn eye values at iter=40
{28,1,5,24,8,2},{12,8,3,10,32,2},{ 5,32,2, 4,32,1} *0.4351 *0.4905 ★★32-2-32-1
{28,1,5,24,8,2},{12,8,3,10,32,2},{ 5,32,2, 4,32,2} *0.4321 *0.5005 ★★32-2-32-2
{28,1,5,24,8,2},{12,8,3,10,32,2},{ 5,32,2, 4,32,2},{ 2,32,2, 1, 64,1} *0.4383 *0.5095 ★★32-2-32-2-64-1
{28,1,5,24,8,2},{12,8,3,10,32,2},{ 5,32,2, 4,64,1} *0.4590 *0.5120 ★★32-2-64-1
{28,1,5,24,8,2},{12,8,3,10,32,2},{ 5,32,2, 4,64,2} *0.4088 *0.5009 ★★32-2-64-2
{28,1,5,24,8,2},{12,8,3,10,32,2},{ 5,32,2, 4,64,2},{ 2,64,2, 1,128,1} 0.8999 *0.5357 ★ 32-2-64-2-128
{28,1,5,24,8,2},{12,8,3,10,64,2},{ 5,64,2, 4,64,1} 0.8999 *0.5096 ★ 64-2-64-1
{28,1,5,24,8,2},{12,8,3,10,64,2},{ 5,64,2, 4,64,2} *0.4148 *0.5099 ★★64-2-64-2
{28,1,5,24,8,2},{12,8,3,10,64,2},{ 5,64,2, 4,64,2},{ 2,64,2, 1,128,1} *0.8999 *0.8999
{28,1,5,24,8,2},{12,8,3,10,64,2},{ 5,64,2, 4,128,1} 0.9003 0.9003
{28,1,5,24,8,2},{12,8,3,10,64,2},{ 5,64,2, 4,128,2} 0.9003 0.5120 ★ 64-2-128-2
{28,1,5,24,8,2},{12,8,3,10,64,2},{ 5,64,2, 4,256,1} 0.9003 0.9003
{28,1,5,24,8,2},{12,8,3,10,64,2},{ 5,64,2, 4,256,2} 0.9003 0.9003
●rsim-20200102 … Added -D CNN_DEPTH
【Below: training with train=50000】 CIFAR10 cnn eye values at iter=40
{28,1,5,24,8,1},{24,8,3,22,24,1} 0.9001 0.6192
{28,1,5,24,8,1},{24,8,3,22,32,1} 0.9001 *0.6015
{28,1,5,24,8,1},{24,8,3,22,64,1} 0.9001 *0.5803
{28,1,5,24,8,1},{24,8,3,22,32,1},{22,32,3,20,64,1} 0.9001 0.8995
{28,1,5,24,8,1},{24,8,3,22,32,1},{22,32,3,20,64,1},{20,64,3,18,128,1} 0.9001 0.9001
{28,1,5,24,8,1},{24,8,3,22,32,1},{22,32,3,20,64,2} 0.8996 0.8996
{28,1,5,24,8,1},{24,8,3,22,32,1},{22,32,3,20,64,2} 0.8996 0.8996
{28,1,5,24,8,1},{24,8,3,22,16,1},{22,16,3,20,32,2},{10,32,3, 8, 64,1},{8,64,3,6,128,1},{6,128,3,4,128,2},{2,128,1,2,128,1},{2,128,1,2,128,1},{2,128,1,2,128,2}
0.9000 0.9000
{28,1,5,24,8,1},{24,8,3,22,64,1},{22,64,3,20,64,2},{10,64,3, 8,256,1},{8,256,3,6,256,2}
0.8999 0.8999
{28,1,5,24,8,1},{24,8,3,22,32,2} *0.4745 *0.5576 ★32
{28,1,5,24,8,1},{24,8,3,22,64,2} *0.4578 *0.5820 ★64
{28,1,5,24,8,1},{24,8,3,22,32,2},{11,32,3, 9,64,1},{ 9,64,3, 7,128,2} 0.9000 0.9000
{28,1,5,24,8,2},{12,8,3,10,32,1} *0.4713 *0.5437 ★32
{28,1,5,24,8,2},{12,8,3,10,64,1} *0.5307 *0.5686 ★64
{28,1,5,24,8,2},{12,8,3,10,32,1},{10,32,2, 9,64,1},{ 9,64,2, 8,128,2} 0.8991 0.9002
{28,1,5,24,8,2},{12,8,3,10,32,1},{10,32,3, 8,64,2},{ 4,64,3, 2,128,1} 0.8991 0.8991
{28,1,5,24,8,2},{12,8,3,10,64,1},{10,32,2, 9,64,1},{ 9,64,2, 8,128,2} 0.8999 0.8999
{28,1,5,24,8,2},{12,8,3,10,32,2},{ 5,32,2, 4,64,1} *0.4590 *0.5120 ★★32-2-64-1
{28,1,5,24,8,2},{12,8,3,10,64,2},{ 5,64,2, 4,64,1} 0.8998 *0.5096 ★ 64-2-64-1
{28,1,5,24,8,2},{12,8,3,10,64,2},{ 5,64,2, 4,128,1} 0.8999 0.8999
{28,1,5,24,8,2},{12,8,3,10,32,2},{ 5,32,3, 3,64,1} *0.4703 0.9000
{28,1,5,24,8,2},{12,8,3,10,32,2},{ 5,32,3, 3,64,1},{ 3,64,3, 1,128,1} 0.9000 0.9000
{28,1,5,24,8,2},{12,8,3,10,32,2},{ 5,32,2, 4,64,2} *0.4088 *0.5009 ★★32-2-64-2
{28,1,5,24,8,2},{12,8,3,10,32,2},{ 5,32,3, 3,64,2} 0.9002 *0.5869 32-3-64-2
{28,1,5,24,8,2},{12,8,3,10,64,2},{ 5,64,2, 4,64,2} *0.4148 *0.5099 ★★64-2-64-2
{28,1,5,24,8,2},{12,8,3,10,64,2},{ 5,64,2, 4,64,2},{ 2,64,2, 1,256,1} 0.9000 0.9000
{28,1,5,24,8,2},{12,8,3,10,32,2},{ 5,32,2, 4,64,2},{ 2,64,2, 1,128,1} 0.8999 *0.5357 32-2-64-2-128
{28,1,5,24,8,2},{12,8,3,10,64,2},{ 5,64,2, 4,128,2} 0.9002 *0.5120 ★ 64-2-128-2
{28,1,5,24,8,2},{12,8,3,10,64,2},{ 5,64,2, 4,256,2} 0.8999 0.8999
{28,1,5,24,8,2},{12,8,3,10,128,2},{5,128,2,4,128,2} 0.8999 0.8999
{28,1,5,24,8,2},{12,8,5, 8,64,2},{ 4,64,2, 2,128,2} 0.5407 0.5120 ★★64-2-128-2
【Below: training with train+test=60000】 MNIST cnn eye CIFAR10 cnn eye ★★ marks the default model
{28,1,5,24,8,1} *0.0169 0.0357 0.9000 0.5625 1bit-4SAD(TH=3)
{28,1,5,24,8,1} *0.0170 *0.0358 0.9002 *0.4224 laplacian(TH=13)iter20
{28,1,5,24,8,1},{24,8,3,22,16,1} *0.0150 *0.0231 0.8999 *0.2809 laplacian(TH=13)iter20 0.2490(iter30) 0.2346(iter40) 0.2259(iter100)★
{28,1,5,24,8,1},{24,8,3,22,20,1} 0.9013 *0.0231 0.9002 *0.2459 laplacian(TH=13)iter20 0.2079(iter30) 0.1893(iter40) 0.1454(iter100)★
{28,1,5,24,8,1},{24,8,3,22,24,1} 0.9013 *0.0230 0.9001 *0.1945 laplacian(TH=13)iter20 0.1546(iter30) 0.1268(iter40) 0.0012(iter100)★★
{28,1,5,24,8,1},{24,8,3,22,32,1} 0.9013 0.0272 0.9001 0.3170 1bit-4SAD(TH=3)
{28,1,5,24,8,1},{24,8,3,22,32,1} 0.9013 0.0248 0.9001 0.1681 laplacian(TH=10)
{28,1,5,24,8,1},{24,8,3,22,32,1} 0.9013 *0.0246 0.9001 0.1617 laplacian(TH=12)
{28,1,5,24,8,1},{24,8,3,22,32,1} 0.9013 *0.0250 0.9001 *0.1531 laplacian(TH=13)iter20 0.0982(iter30) 0.0269(iter40)★
{28,1,5,24,8,1},{24,8,3,22,32,1} 0.9013 0.0264 0.9001 *0.1456 laplacian(TH=14)
{28,1,5,24,8,1},{24,8,3,22,32,1} 0.9013 0.0251 0.9001 0.1575 laplacian(TH=16)
{28,1,5,24,8,1},{24,8,3,22,32,1} 0.9013 0.0255 0.9001 0.1718 laplacian(TH=20)
{28,1,5,24,8,1},{24,8,3,22,32,1} 0.9013 0.0254 0.9001 0.2289 laplacian(TH=32)
{28,1,5,24,8,1},{24,8,3,22,32,1},{22,32,3,20,32,1} 0.9013 0.9013 0.9003 0.9001 laplacian(TH=13)
{28,1,5,24,8,1},{24,8,3,22,64,1} 0.9013 0.0284 0.8999 0.2408 1bit-4SAD(TH=3)
{28,1,5,24,8,1},{24,8,3,22,64,1} 0.9013 0.0248 0.8999 0.0843 laplacian(TH=10)
{28,1,5,24,8,1},{24,8,3,22,64,1} 0.9013 0.0245 0.8999 0.0837 laplacian(TH=12)
{28,1,5,24,8,1},{24,8,3,22,64,1} 0.9013 *0.0243 0.8999 *0.0808 laplacian(TH=13)iter20 0.0110(iter30)★
{28,1,5,24,8,1},{24,8,3,22,64,1} 0.9013 0.0249 0.8999 0.1419 laplacian(TH=14)
{28,1,5,24,8,1},{24,8,3,22,64,1} 0.9013 0.0250 0.8999 0.0899 laplacian(TH=16)
{28,1,5,24,8,1},{24,8,3,22,64,1} 0.9013 0.0258 0.8999 0.0912 laplacian(TH=20)
{28,1,5,24,8,1},{24,8,3,22,64,1} 0.9013 0.0277 0.8999 0.1321 laplacian(TH=32)
{28,1,5,24,8,1},{24,8,3,22,64,1},{22,64,3,20,64,1} 0.9013 0.9013 0.8995 0.9004 laplacian(TH=13)
{28,1,5,24,8,1},{24,8,3,22,128,1} 0.9013 0.0229 0.9003 0.0702 laplacian(TH=13)iter20 0.0001(iter30)★
{28,1,5,24,8,1},{24,8,3,22,32,2} 0.0152 0.0203 0.3785 0.3644 laplacian(TH=13)iter20 0.3443(iter40)★
{28,1,5,24,8,2} 0.0176 0.0330 0.5010 0.6491 1bit-4SAD(TH=3)
{28,1,5,24,8,2} 0.0176 0.0402 0.4951 0.5295 laplacian(TH=13)
{28,1,5,24,8,2},{12,8,3,10,32,1} 0.0130 0.0181 0.3892 0.3798 laplacian(TH=13)
{28,1,5,24,8,2},{12,8,3,10,64,1} 0.9013 0.0174 0.3695 0.2915 laplacian(TH=13)
{28,1,5,24,8,2},{12,8,3,10,32,2} 0.0128 0.0186 0.3505 0.4247 laplacian(TH=13)iter40
{28,1,5,24,8,2},{12,8,3,10,32,2},{ 5,32,2, 4,64,2} 0.0092 0.0188 0.3478 0.4083 laplacian(TH=13)iter40
{28,1,5,24,8,2},{12,8,3,10,32,2},{ 5,32,2, 4,64,2},{2,64,2,1,128,1} 0.9000 0.0165 0.9000 0.3246 laplacian(TH=13)iter40
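Most of the layer tuples in the tables above are consistent with the reading {input size, input channels, kernel size, output size, output channels, pooling factor}, where output size = input size - kernel size + 1 and the next layer's input size = output size / pooling factor. This reading is inferred from the numbers and is not stated in the README, and a few rows do not fit it. The struct, its field names, and the checker below are a hypothetical sketch of that rule.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed reading of one {a,b,c,d,e,f} tuple; field names are hypothetical. */
struct cnn_layer {
    int isize;  /* input width/height */
    int ich;    /* input channels */
    int ksize;  /* kernel width/height */
    int osize;  /* output width/height before pooling */
    int och;    /* output channels */
    int psize;  /* pooling factor */
};

/* Returns 1 when every layer satisfies osize == isize - ksize + 1 and feeds
   the next layer with osize / psize pixels and och channels, else 0. */
static int cnn_chain_ok(const struct cnn_layer *p, size_t depth)
{
    size_t i;
    assert(p != NULL);
    for (i = 0; i < depth; i++) {
        if (p[i].psize <= 0) return 0;
        if (p[i].osize != p[i].isize - p[i].ksize + 1) return 0;
        if (i + 1 < depth) {
            if (p[i + 1].isize != p[i].osize / p[i].psize) return 0;
            if (p[i + 1].ich != p[i].och) return 0;
        }
    }
    return 1;
}
```

For example, {28,1,5,24,8,2},{12,8,3,10,32,2},{5,32,2,4,64,2} passes the check, because 28-5+1=24, 24/2=12, 12-3+1=10, 10/2=5, and 5-2+1=4.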
●rsim-20200101 … Added Option-x and hist_flat(). 8b-edge is effective only up to CNN_DEPTH=2
●rsim-20191231 … Unified into cnnet.c:p[CNN_DEPTH]={{},{}}
●rsim-20191230 … 5x5 - bias - relu - pool - 8x3x3 - bias - relu - pool - fc
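●rsim-20191221 … 5x5 - bias - relu - pool - fc (changed pooling from 11x11 to 12x12)
●rsim-20191220 … nchannel=20; cifar10 support and search for accuracy improvements in progress (nchannel=20)
●rsim-20191219 … final mnist version
●rsim-20191218 … Changed the motion_xy/z comparison from exact slit match to partial match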
---
【mplayer … movie->RGB24 conversion】
FreeBSD7.2(32bit) arch00
FreeBSD12.0(32bit) arch09,10,11,12,13,14,15
CentOS7.7(64bit) arch07 cad107 wonder47
----
【capture usage】
●Generates mcamera images for the standard input of rsim/mplayer
make -f Makefile-bktr all clean .. capture-bktr: for FreeBSD+/dev/bktr01 (arch09)
make -f Makefile-v4l2 all clean .. capture-v4l2: for FreeBSD/CentOS+/dev/video0 (arch07/arch09)
----
【Command examples】
### FreeBSD/CentOS
### arch00,07,09,10-15/cad107,wonder47
mplayer/mplayer image-data/svideo1.mp4 >/dev/null
mplayer/mplayer image-data/svideo2.mp4 >/dev/null
mplayer/mplayer image-data/drive.mp4 >/dev/null
mplayer/mplayer image-data/small.avi >/dev/null
mplayer/mplayer image-data/svideo1.mp4 | rsim/rsim -2 -w640 -h480 -x -r -I0 -V3 -C1 -F1
mplayer/mplayer -<image-data/svideo2.mp4 | rsim/rsim -2 -w640 -h480 -x -r -I0 -V3 -C1 -F1
mplayer/mplayer -<image-data/drive.mp4 | rsim/rsim -2 -w640 -h720 -x -r -I0 -V3 -C1 -F1
### FreeBSD+/dev/bktr01 (stereo camera)
### arch09,12,13
ssh arch09 proj-arm64/sample/capture/capture-bktr | mplayer/mplayer - -demuxer rawvideo -rawvideo w=640:h=240:format=rgb24 | rsim/rsim -2 -w320 -h240 -x -r -I0 -V3 -C1 -F1
### FreeBSD/CentOS+/dev/video0 (single camera)
### arch09,10,11,14,15/cad107,wonder47
ssh arch09 proj-arm64/sample/capture/capture-v4l2 | mplayer/mplayer - -demuxer rawvideo -rawvideo w=320:h=240:format=rgb24 | rsim/rsim -w320 -h240 -x -r -I0 -V3 -C1 -F1
### FreeBSD/CentOS training MNIST
rsim/rsim -t -I0
### FreeBSD/CentOS VBGMM Image segmentation
mplayer/mplayer image-data/small.avi | ImageSegment/IS_VBGMM -w320 -h240 -x -c10
mplayer/mplayer image-data/small.avi | ImageSegment/IS_VBGMM -w320 -h240 -x -c3
mplayer/mplayer image-data/small.avi | ImageSegment/IS_VBGMM -w320 -h240 -x -c16
mplayer/mplayer image-data/svideo1.mp4 | ImageSegment/IS_VBGMM -w1280 -h480 -x -c10 -r
### cad101
mplayer/mplayer image-data/Milan_crosswalk_small.mp4 | ImageSegment_VB_GPU/vbgmm -w640 -h360 -x -c5 -a3 -u -t0.5
mplayer/mplayer image-data/Milan_crosswalk.mp4 | ImageSegment_VB_GPU/vbgmm -w1280 -h720 -x -c5 -a3 -u -t0.5
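In the camera pipelines above, mplayer is told the stream is rawvideo in rgb24 format, and the stereo example passes w=640:h=240 to mplayer but -2 -w320 -h240 to rsim. That is consistent with two 320-pixel-wide images packed side by side, although the README does not spell out the layout. Under that assumption, and assuming headerless packed frames of 3 bytes per pixel, the per-frame byte count can be sketched as follows (the function name is hypothetical):

```c
#include <assert.h>
#include <stddef.h>

/* Bytes in one packed RGB24 frame: 3 bytes per pixel, no row padding assumed.
   eyes is 1 for a single camera, 2 for side-by-side stereo (assumed layout). */
static size_t rgb24_frame_bytes(size_t width, size_t height, size_t eyes)
{
    assert(eyes == 1 || eyes == 2);
    return width * height * 3 * eyes;
}
```

For the stereo case, rgb24_frame_bytes(320, 240, 2) equals 640*240*3 bytes, the size of one frame as declared to mplayer.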
---
【Old psim … NN simulator and ASICEDB6】 2019/10/01
●psim.c.20191001 … Adapted nsim.c.20150814 for ASICEDB6
----
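【Old nsim … NN simulator and ASICEDB4】 2015/7/22
●nsim.c.20150814 … Merged the synapse model from the Kimura/Kameda (木村・亀田) version
●nsim.c.20150811 … Extended the FILM interface to 160 bits
●nsim.c.20150730 … Bold input patterns (inputs 0,1,2,3,4 are recognized as 0,1,4,4,4)
●nsim.c.20150729 … Algorithm fix (inputs 0,1,2,3,4 are recognized as 0,2,3,3,4)
●nsim.c.20150728 … Input patterns also reduced in size
●nsim.c.20150727 … Rearranged the output positions to 2x5
●nsim.c.20150726 … Halved NWIDTH
●nsim.c.20150725 … Neurons are initialized every time the digit changes
●nsim.c.20150724 … Always recognizes "3" at recognition time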
----
【MISC】
BayesianGMM_CUDA Nishimoto (西本) VBGMM/GPU (master's thesis)
BayesianGMM_HLS Nishimoto (西本) VBGMM/CPU (master's thesis)
ImageSegment Nishimoto (西本) VBGMM image segmentation
----
【Programs targeted for NCHIP evaluation】
・filter+rmm.c
・mm_cnn_lf/{mm,cnn,gather,gdepth}+rmm.c
----
【Programs supported by csim】 ○ runs to completion, ● in progress
┌────┬────┬────┰───┬───┬───────┰───────┬───────┬───────┬───────┬───────┰───┬─┬─┐
-DPTHREAD │ │○ │ ┃ │ │ ┃ │○ │ │ │ ┃ │ │ │
-DARMSIML │ │ │ ┃ │ │ ┃○ │○ │○ │○ │○ ┃ │ │ │※SIML+mmap(emax_start()->svc)
-DARMZYNQ │ │ │ ┃ │ │ ┃ │ │ │ │ ┃○ │○│○│※ZYNQ+mmap
-DEMAX6 │ │ │○ ┃ │ │ ┃ │ │○ │○ │○ ┃○ │○│○│conv://EMAX5A(emax6_start())
└────┴────┴────┸───┴───┴───────┸───────┴───────┴───────┴───────┴───────┸───┴─┴─┘
│Intel │.pth │.emax6nc│GPU │MIC │armv7 │armv8 │.pth(8) │.emax6nc │.emax6 │emax6+dma zynq.emax6nc│zynq.emax6(asic)
┌────┬────┬────┬───┬───┬───────┬───────┬───────┬───────┬───────┬───────┐ zynq.emax6+dma(asic)
stan-Bubble │○ │- │- │- │- │ d8514(1356be)│1164e2(12093d)│- │- │- │- │
stan-FFT │○ │- │- │- │- │ 52a10( 58d99)│ af4e( 17244)│- │- │- │- │
stan-Intmm │○ │- │- │- │- │ 6f6fb( 74583)│ 23bde( 3b114)│- │- │- │- │
stan-Mm │○ │- │- │- │- │ 922aa( b9964)│ 2884a( 426bb)│- │- │- │- │
stan-Perm │○ │- │- │- │- │ e1c56(1773da)│136569(18faeb)│- │- │- │- │
stan-Puzzle │○ │- │- │- │- │58f746(5a4229)│4fc440(54c470)│- │- │- │- │
stan-Queens │○ │- │- │- │- │ 4514( 68e0)│ 40a7( ba18)│- │- │- │- │
stan-Quick │○ │- │- │- │- │ b8543( c5e55)│ c0f0d( e6cf2)│- │- │- │- │
stan-Towers │○ │- │- │- │- │11c414(1c9ea4)│10d037(1458c6)│- │- │- │- │
stan-Trees │○ │- │- │- │- │11e471(198425)│112884(315a04)│- │- │- │- │L2miss(DDR3) assumed to be 50ns
4dimage/gdisp │○ │- │- │- │- │- │- │- │- │- │- │EMAX :1.0GHz assumed
4dimage/gather│gather │.pth │.emax6nc│ │ │ │-arm │-4core/8thread│-arm.emax6nc │-arm.emax6 │-arm.emax6.dma│ARMv8:1.0GHz MM delay=L2DELAY=50
L2delay=50 │○ │○ │○ │- │- │- │cycl=0d4ded56 │cycl=035dff0e │cycl=2c35a87f │cycl=0eca79b1 │cycl=00a7a426 │exactly 1/2.5 of the bsim cycle count
4dimage/gdepth│gdepth │.pth │.emax6nc│ │ │ │-arm │-4core/8thread│-arm.emax6nc │-arm.emax6 │-arm.emax6.dma│ equivalent to bsim if read as ARM=1.0GHz, IMAX=1.0GHz
L2delay=50 │○ │○ │○ │- │- │- │cycl=6e9d9ac5 │cycl=1c1ca380 │cycl=9e25ff3d │cycl=1d52c0e8 │cycl=0155db7c │exactly 1/2.16 of the bsim cycle count
conv16 │○ │- │○ │- │- │- │cycl=0243c580 │- │cycl=0576a9c0 │cycl=00a242b1 │cycl=00780839 │
filter │○ │- │○ │- │- │- │○ │- │○ │○ │○ │for X<<24|Y<<16, -O3 emits an unaligned 16-byte ld instruction
stencil-pipe │○ │- │TBD │- │- │- │○ │- │TBD │TBD │TBD │ OK with -O2
├────┼────┼────┼───┼───┼───────┼───────┼───────┼───────┼───────┼───────┤
│ │ │ │ │ │ │ │ │ │LDDMQ_MUX=8 │LDDMQ_MUX=8 │
tricount8 │○ │○ │○ │○ │○ │- │cy=003dbdb1 │BK=8/16 hangs │cy=004458b7 │cy=■■■ not yet │cy=■■■ not yet │MAXL2BK=8, 8-stage run.small
│○ │○ │○ │○ │○ │- │cy=893a328e6 │cy=33ab17506 │cy=8f2f654d7 │cy=■■■ not yet │cy=■■■ not yet │MAXL2BK=8, 8-stage run
dijkstra3 │○ │○ │- │○ │○ │- │○ │○ │- │- │- │
Kdijkstra │○ │○ │- │○ │○ │- │○ │○ │- │- │- │
---- └────┴────┴────┴───┴───┴───────┴───────┴───────┴───────┴───────┴───────┘
【Programs supported by bsim】 ○ runs to completion, ● in progress
┌────┬────┬────┰───┬───┬───────┰───────┬───────┬───────┬───────┬───────┰───┬─┬─┐
-DPTHREAD │ │○ │ ┃ │ │ ┃ │○ │ │ │ ┃ │ │ │
-DARMSIML │ │ │ ┃ │ │ ┃○ │○ │○ │○ │ ┃ │ │ │※SIML+mmap(emax_start()->svc)
-DARMZYNQ │ │ │ ┃ │ │ ┃ │ │ │ │ ┃○ │○│ │※ZYNQ+mmap
-DEMAX5 │ │ │○ ┃ │ │ ┃ │ │○ │○ │ ┃○ │○│ │conv://EMAX5A(emax5_start())
└────┴────┴────┸───┴───┴───────┸───────┴───────┴───────┴───────┴───────┸───┴─┴─┘
│Intel │.pth │.emax5nc│GPU │MIC │armv7 │armv8 │.pth(8) │.emax5nc │.emax5 │ zynq.emax6nc│zynq.emax6(asic)
┌────┬────┬────┬───┬───┬───────┬───────┬───────┬───────┬───────┬───────┐
stan-Bubble │○ │- │- │- │- │ d8514(1356be)│1164e2(11a6c1)│- │- │- │ │
stan-FFT │○ │- │- │- │- │ 52a10( 58d99)│ af4e( ec71)│- │- │- │ │
stan-Intmm │○ │- │- │- │- │ 6f6fb( 74583)│ 23bde( 2cf7c)│- │- │- │ │
stan-Mm │○ │- │- │- │- │ 922aa( b9964)│ 2884a( 34639)│- │- │- │ │
stan-Perm │○ │- │- │- │- │ e1c56(1773da)│136569(1891f3)│- │- │- │ │
stan-Puzzle │○ │- │- │- │- │58f746(5a4229)│4fc440(50fe91)│- │- │- │ │
stan-Queens │○ │- │- │- │- │ 4514( 68e0)│ 40a7( 6328)│- │- │- │ │
stan-Quick │○ │- │- │- │- │ b8543( c5e55)│ c0f0d( d0a9a)│- │- │- │ │
stan-Towers │○ │- │- │- │- │11c414(1c9ea4)│10d037(13efb4)│- │- │- │ │
stan-Trees │○ │- │- │- │- │11e471(198425)│112884(1b9548)│- │- │- │ │L2miss(DDR3) assumed to be 80ns
4dimage/gdisp │○ │- │- │- │- │- │- │- │- │- │ │EMAX :240MHz MM delay=L2DELAY/10=20
4dimage/gather│gather │.pth │.emax5nc│ │ │ │-arm │-2core/8thread│-arm.emax5nc │-arm.emax5 │ │ARMv8:2.4GHz MM delay=L2DELAY =200
L2delay=200│○ │○ │○ │- │- │- │cycl=21fbd6f5 │cycl=10c9dcb7 │cycl=3b23cec6 │cycl=01ad8060 │ │20.3x faster than armv8
4dimage/gdepth│gdepth │.pth │.emax5nc│ │ │ │-arm │-2core/8thread│-arm.emax5nc │-arm.emax5 │ │
L2delay=200│○ │○ │○ │- │- │- │cycl=7c58448f │cycl=49a74af3 │cycl=ac79be75 │cycl=02e0ff72 │ │43.2x faster than armv8
conv16 │○ │- │○ │- │- │- │cycl=02408f32 │- │cycl=057e8e90 │cycl=0036d3a3 │ │10.0x faster than armv8
filter │○ │- │○ │- │- │- │○ │- │○ │○ │ │for X<<24|Y<<16, -O3 emits an unaligned 16-byte ld instruction
stencil-pipe │○ │- │TBD │- │- │- │○ │- │TBD │TBD │ │ OK with -O2
├────┼────┼────┼───┼───┼───────┼───────┼───────┼───────┼───────┼───────┤
│ │ │ │ │ │ │ │ │ │LDDMQ_MUX=8 │ │
tricount8 │○ │○ │○ │○ │○ │- │cy=002ed584 │○ │cy=0034e7f5 │cy=006c34d6 │ │MAXL2BK=64, 8-stage run.small
│○ │○ │○ │○ │○ │- │cy=8_93a319cb │cy=3_37d0f00c │cy=8_f73f2765 │cy=3_93c54b01 │ │MAXL2BK=8, 8-stage run
dijkstra3 │○ │○ │- │○ │○ │- │○ │○ │- │- │ │
Kdijkstra │○ │○ │- │○ │○ │- │○ │○ │- │- │ │
---- └────┴────┴────┴───┴───┴───────┴───────┴───────┴───────┴───────┴───────┘
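The speed-up notes in the bsim table above follow from dividing the armv8 cycle count by the .emax5 cycle count, both printed in hexadecimal. A minimal sketch of that arithmetic, with a hypothetical function name:

```c
#include <assert.h>
#include <stdlib.h>

/* Speed-up of an accelerated run over a baseline run, computed from the
   hexadecimal cycl= values printed in the table. */
static double speedup_from_hex(const char *base_hex, const char *accel_hex)
{
    unsigned long long base = strtoull(base_hex, NULL, 16);
    unsigned long long accel = strtoull(accel_hex, NULL, 16);
    assert(accel != 0);
    return (double)base / (double)accel;
}
```

For gather, speedup_from_hex("21fbd6f5", "01ad8060") is about 20.3, and for gdepth, speedup_from_hex("7c58448f", "02e0ff72") is about 43.2, in line with the notes in the table.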
●bsim-20160521
4dimage/gdepth-arm.emax5
000:SVC 00000000_4c14cfb9 EMAX5 start conf=00000000_00015c40 lmmi=00000000_4ffff600 regv=00000000_4fffd600 EMAX_START(00000001_6e748320) EMAX_TERM(00000001_6e752e2e) c0 s0 i39 d203 l3221 r87 s1 e827 z1 t0 total=4379
000:SVC 00000000_4c14d177 EMAX5 start conf=00000000_00015c40 lmmi=00000000_4ffff600 regv=00000000_4fffd600 EMAX_START(00000001_6e753ed2) EMAX_TERM(00000001_6e75e9e0) c0 s0 i39 d203 l3221 r87 s1 e827 z1 t0 total=4379
000:SVC 00000000_4c14d335 EMAX5 start conf=00000000_00015c40 lmmi=00000000_4ffff600 regv=00000000_4fffd600 EMAX_START(00000001_6e75fa84) EMAX_TERM(00000001_6e7638d2) c0 s0 i39 d203 l437 r87 s1 e827 z1 t0 total=1595
000:SVC 00000000_4c14d4f3 EMAX5 start conf=00000000_00015c40 lmmi=00000000_4ffff600 regv=00000000_4fffd600 EMAX_START(00000001_6e764976) EMAX_TERM(00000001_6e76f484) c0 s0 i39 d203 l3221 r87 s1 e827 z1 t0 total=4379
000:SVC 00000000_4c14d501 EMAX5 drain_dirty_lmm EMAX_START(00000001_6e76f81c) EMAX_TERM(00000001_6e770014) c0 s0 i0 d203 l0 r0 s0 e0 z1 t0 total=204
000:PA 00000000_4c14d511 siml_emax5: tinit=00000000_00000000 conf=00000000_00000000 scon=00000000_00000000 lmmi=00000000_0000abd8 drain=00000000_00037e7f load=00000000_002e6f08
regv=00000000_00017f58 start=00000000_00000468 exec=00000000_000e3bf8 term=00000000_00000469 trans=00000000_00000000
EMAX240MHz=00000000_00425d80(icidlrett) ARM2400MHz=00000000_02e0ff72
000:step=00000000_0007ad65 cycle=00000000_02e0ff72 i1(100.0%)wait=00000000_00000000 d1( 67.1% hit=00000000_000287d1 mis=00000000_00013d49)wait=00000000_00154348 l2( 90.3% hit=00000000_00011e6d mis=00000000_00001edc) g2( 0.0% hit=00000000_00000000 mis=00000000_00001edc) flush(L1->00000000_001a76cccycle, L2->00000000_00000020cycle)
4dimage/gdepth-arm.emax5nc
000:step=00000000_91915ef8 cycle=00000000_ac79be75 i1(100.0%)wait=00000000_00000000 d1( 98.0% hit=00000000_0aa60a6b mis=00000000_003663f2)wait=00000000_04072778 l2( 67.5% hit=00000000_0024bb0d mis=00000000_0011a8e8) g2( 0.0% hit=00000000_00000000 mis=00000000_00120b1e) flush(L1->00000000_00000000cycle, L2->00000000_00000000cycle)
4dimage/gdepth-arm
000:step=00000000_62c99be9 cycle=00000000_7c58448f i1(100.0%)wait=00000000_00000000 d1( 98.9% hit=00000000_1259d090 mis=00000000_00365ef8)wait=00000000_0203315b l2( 67.6% hit=00000000_0024c4c1 mis=00000000_00119a37) g2( 0.0% hit=00000000_00000000 mis=00000000_0011f88d) flush(L1->00000000_00000000cycle, L2->00000000_00000000cycle)
4dimage/gather-arm.emax5
000:SVC 00000000_4bb16146 EMAX5 start conf=00000000_00015980 lmmi=00000000_4ffff600 regv=00000000_4fffd600 EMAX_START(00000001_44f539ae) EMAX_TERM(00000001_44f56e60) c0 s0 i39 d390 l4 r87 s1 e827 z1 t0 total=1349
000:SVC 00000000_4bb16256 EMAX5 start conf=00000000_00015980 lmmi=00000000_4ffff600 regv=00000000_4fffd600 EMAX_START(00000001_44f57b62) EMAX_TERM(00000001_44f5d454) c0 s0 i39 d390 l932 r87 s1 e827 z1 t0 total=2277
000:SVC 00000000_4bb16366 EMAX5 start conf=00000000_00015980 lmmi=00000000_4ffff600 regv=00000000_4fffd600 EMAX_START(00000001_44f5e156) EMAX_TERM(00000001_44f63a48) c0 s0 i39 d390 l932 r87 s1 e827 z1 t0 total=2277
000:SVC 00000000_4bb16476 EMAX5 start conf=00000000_00015980 lmmi=00000000_4ffff600 regv=00000000_4fffd600 EMAX_START(00000001_44f6474a) EMAX_TERM(00000001_44f6a03c) c0 s0 i39 d390 l932 r87 s1 e827 z1 t0 total=2277
000:SVC 00000000_4bb16586 EMAX5 start conf=00000000_00015980 lmmi=00000000_4ffff600 regv=00000000_4fffd600 EMAX_START(00000001_44f6ad3e) EMAX_TERM(00000001_44f70630) c0 s0 i39 d390 l932 r87 s1 e827 z1 t0 total=2277
000:SVC 00000000_4bb16594 EMAX5 drain_dirty_lmm EMAX_START(00000001_44f709c8) EMAX_TERM(00000001_44f7190e) c0 s0 i0 d390 l0 r0 s0 e0 z1 t0 total=391
000:PA 00000000_4bb1659f siml_emax5: tinit=00000000_00000000 conf=00000000_00000037 scon=00000000_00000000 lmmi=00000000_0000abd8 drain=00000000_0006b674 load=00000000_000e0b80
regv=00000000_00017f58 start=00000000_00000468 exec=00000000_000e3bf8 term=00000000_00000469 trans=00000000_00000000
EMAX240MHz=00000000_00253224(icidlrett) ARM2400MHz=00000000_01ad8060
000:step=00000000_0004aeb5 cycle=00000000_01ad8060 i1(100.0%)wait=00000000_00000161 d1( 65.4% hit=00000000_000203a2 mis=00000000_00011133)wait=00000000_0011186a l2( 88.6% hit=00000000_0000f22a mis=00000000_00001f1d) g2( 0.0% hit=00000000_00000000 mis=00000000_00001f1d) flush(L1->00000000_0017b5c9cycle, L2->00000000_00000028cycle)
4dimage/gather-arm.emax5nc
000:step=00000000_295b38a2 cycle=00000000_3b23cec6 i1(100.0%)wait=00000000_00000000 d1( 95.4% hit=00000000_0192436f mis=00000000_00136dc8)wait=00000000_0267bd1a l2( 12.1% hit=00000000_00025c2e mis=00000000_001111d5) g2( 0.0% hit=00000000_00000000 mis=00000000_0012bc44) flush(L1->00000000_00000000cycle, L2->00000000_00000000cycle)
4dimage/gather-arm
000:step=00000000_09a8fdca cycle=00000000_21fbd6f5 i1(100.0%)wait=00000000_00000000 d1( 94.7% hit=00000000_015d1ab7 mis=00000000_001359fd)wait=00000000_0136cb21 l2( 11.9% hit=00000000_00024d87 mis=00000000_00110c85) g2( 0.0% hit=00000000_00000000 mis=00000000_0012b140) flush(L1->00000000_00000000cycle, L2->00000000_00000000cycle)
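●bsim-20160520 … Added a mechanism that coarsely skips ahead to the DRAIN/LOAD candidates; revised the EMAX5 start-up overhead (with an ACP connection, no L2-flush of the stack region is needed)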
../../src/bsim/bsim -x test007-arm.emax5|jl(mapdist=1,lmp,lmd)
000:SVC 00000000_000aaa88 EMAX5 start conf=00000000_00015440 lmmi=00000000_4ffff600 regv=00000000_4fffd600 c55 s0 i39 d8 l194 r87 s1 e221 z1 t0 total=606
000:SVC 00000000_000aab5a EMAX5 start conf=00000000_00015440 lmmi=00000000_4ffff600 regv=00000000_4fffd600 c0 s2 i39 ★d8 ★l8 r87 s1 e221 z1 t0 total=367
000:SVC 00000000_000aac2c EMAX5 start conf=00000000_00015440 lmmi=00000000_4ffff600 regv=00000000_4fffd600 c0 s2 i39 ★d8 ★l8 r87 s1 e221 z1 t0 total=367
../../src/bsim/bsim -x test006-arm.emax5|jl(mapdist=1)
000:SVC 00000000_000aaa81 EMAX5 start conf=00000000_00015420 lmmi=00000000_4ffff600 regv=00000000_4fffd600 c55 s0 i39 d6 l193 r87 s1 e221 z1 t0 total=603
000:SVC 00000000_000aab4c EMAX5 start conf=00000000_00015420 lmmi=00000000_4ffff600 regv=00000000_4fffd600 c0 s2 i39 d50 ★l69 r87 s1 e221 z1 t0 total=470
000:SVC 00000000_000aac17 EMAX5 start conf=00000000_00015420 lmmi=00000000_4ffff600 regv=00000000_4fffd600 c0 s2 i39 d50 ★l69 r87 s1 e221 z1 t0 total=470
../../src/bsim/bsim -x test005-arm.emax5|jl(mapdist=0)
000:SVC 00000000_000aaa81 EMAX5 start conf=00000000_00015420 lmmi=00000000_4ffff600 regv=00000000_4fffd600 c55 s0 i39 d6 l193 r87 s1 e221 z1 t0 total=603
000:SVC 00000000_000aab4c EMAX5 start conf=00000000_00015420 lmmi=00000000_4ffff600 regv=00000000_4fffd600 ★c0 s0 i39 d50 l193 r87 s1 e221 z1 t0 total=592
000:SVC 00000000_000aac17 EMAX5 start conf=00000000_00015420 lmmi=00000000_4ffff600 regv=00000000_4fffd600 ★c0 s0 i39 d50 l193 r87 s1 e221 z1 t0 total=592
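In the SVC trace lines above, the total= field equals the sum of the single-letter counters that precede it (for example 55+0+39+8+194+87+1+221+1+0 = 606). The sketch below checks that relation for the counter portion of a line. The function name is hypothetical, and the caller is assumed to pass only the substring starting at the first counter, because the hexadecimal fields earlier in the line would otherwise be misread as counters.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Sums the letter-prefixed counters (e.g. "i39", "d203") that precede
   "total=" and returns 1 if the sum equals the printed total, else 0. */
static int svc_total_matches(const char *counters)
{
    const char *p = counters;
    const char *t;
    unsigned long sum = 0;
    assert(counters != NULL);
    t = strstr(counters, "total=");
    if (t == NULL) return 0;
    while (p < t) {
        if (isalpha((unsigned char)p[0]) && isdigit((unsigned char)p[1])) {
            char *end;
            sum += strtoul(p + 1, &end, 10);  /* counter value after the letter */
            p = end;
        } else {
            p++;                              /* skip spaces and ★ markers */
        }
    }
    return sum == strtoul(t + 6, NULL, 10);   /* 6 == strlen("total=") */
}
```

Passing "c0 s0 i39 d203 l3221 r87 s1 e827 z1 t0 total=4379" returns 1, since the counters add up to 4379.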
●bsim-20160506 … Added the CONF shift feature (STATUS_SCON). In bsim it is enabled with #define FEATURE_SHIFT_CONF
../../src/bsim/bsim -x test005-arm.emax5|jl(mapdist=0)
000:SVC 00000000_000aaa81 EMAX5 start conf=00000000_00015420 lmmi=00000000_4ffff600 regv=00000000_4fffd600 c55 s0 i39 d33 l219 r87 s1 e221 z1 t0 total=656
000:SVC 00000000_000aab4c EMAX5 start conf=00000000_00015420 lmmi=00000000_4ffff600 regv=00000000_4fffd600 ★c0 s0 i39 d76 l219 r87 s1 e221 z1 t0 total=644
000:SVC 00000000_000aac17 EMAX5 start conf=00000000_00015420 lmmi=00000000_4ffff600 regv=00000000_4fffd600 ★c0 s0 i39 d76 l219 r87 s1 e221 z1 t0 total=644
../../src/bsim/bsim -x test006-arm.emax5|jl(mapdist=1)
000:SVC 00000000_000aaa81 EMAX5 start conf=00000000_00015420 lmmi=00000000_4ffff600 regv=00000000_4fffd600 c55 s0 i39 d33 l219 r87 s1 e221 z1 t0 total=656
000:SVC 00000000_000aab4c EMAX5 start conf=00000000_00015420 lmmi=00000000_4ffff600 regv=00000000_4fffd600 c0 s2 i39 d76 ★l95 r87 s1 e221 z1 t0 total=522
000:SVC 00000000_000aac17 EMAX5 start conf=00000000_00015420 lmmi=00000000_4ffff600 regv=00000000_4fffd600 c0 s2 i39 d76 ★l95 r87 s1 e221 z1 t0 total=522
../../src/bsim/bsim -x test007-arm.emax5|jl(mapdist=1,lmp,lmd)
000:SVC 00000000_000aaa88 EMAX5 start conf=00000000_00015440 lmmi=00000000_4ffff600 regv=00000000_4fffd600 c55 s0 i39 d33 l219 r87 s1 e221 z1 t0 total=656
000:SVC 00000000_000aab5a EMAX5 start conf=00000000_00015440 lmmi=00000000_4ffff600 regv=00000000_4fffd600 c0 s2 i39 ★d33 ★l33 r87 s1 e221 z1 t0 total=417
000:SVC 00000000_000aac2c EMAX5 start conf=00000000_00015440 lmmi=00000000_4ffff600 regv=00000000_4fffd600 c0 s2 i39 ★d33 ★l33 r87 s1 e221 z1 t0 total=417
●bsim-20160501 … test007 works (EXEC runs concurrently with lmd/lmp)
../../src/bsim/bsim -x test005-arm.emax5|jl(mapdist=0)
000:SVC 00000000_000aaa81 EMAX5 start conf=00000000_00015420 lmmi=00000000_4ffff600 regv=00000000_4fffd600 c55 i39 d33 l219 r87 e221 z1 t0
000:SVC 00000000_000aab4c EMAX5 start conf=00000000_00015420 lmmi=00000000_4ffff600 regv=00000000_4fffd600 ★c0 i39 d76 l219 r87 e221 z1 t0
000:SVC 00000000_000aac17 EMAX5 start conf=00000000_00015420 lmmi=00000000_4ffff600 regv=00000000_4fffd600 ★c0 i39 d76 l219 r87 e221 z1 t0
../../src/bsim/bsim -x test006-arm.emax5|jl(mapdist=1)
000:SVC 00000000_000aaa81 EMAX5 start conf=00000000_00015420 lmmi=00000000_4ffff600 regv=00000000_4fffd600 c55 i39 d33 l219 r87 e221 z1 t0
000:SVC 00000000_000aab4c EMAX5 start conf=00000000_00015420 lmmi=00000000_4ffff600 regv=00000000_4fffd600 c55 i39 d76 ★l95 r87 e221 z1 t0
000:SVC 00000000_000aac17 EMAX5 start conf=00000000_00015420 lmmi=00000000_4ffff600 regv=00000000_4fffd600 c55 i39 d76 ★l95 r87 e221 z1 t0
../../src/bsim/bsim -x test007-arm.emax5|jl(mapdist=1,lmp,lmd)
000:SVC 00000000_000aaa88 EMAX5 start conf=00000000_00015440 lmmi=00000000_4ffff600 regv=00000000_4fffd600 c55 i39 d33 l219 r87 e221 z1 t0
000:SVC 00000000_000aab5a EMAX5 start conf=00000000_00015440 lmmi=00000000_4ffff600 regv=00000000_4fffd600 c55 i39 ★d33 ★l33 r87 e221 z1 t0
000:SVC 00000000_000aac2c EMAX5 start conf=00000000_00015440 lmmi=00000000_4ffff600 regv=00000000_4fffd600 c55 i39 ★d33 ★l33 r87 e221 z1 t0
----