
Questions about "get_class_embedding.py" #3

Open
xiaoxiaomiao39 opened this issue Apr 8, 2022 · 4 comments

Comments


xiaoxiaomiao39 commented Apr 8, 2022

Many thanks for this excellent work.
I am trying to use the dataset (https://drive.google.com/drive/folders/1Ytv02FEMk_n_qJui8-fKowr5xKZTpYWb?usp=sharing) and the pretrained pSp model (https://drive.google.com/drive/folders/1gTSghHGuwoj9gKsLc2bcUNF6ioFBpRWB?usp=sharing) you provided to get the embeddings:

 python tools/get_class_embedding.py \
--class_embedding_path=save/classs/embeddings \
--psp_checkpoint_path=pretrained/pSp/psp_animalfaces.pt \
--train_data_path=data/age_animal/animal_faces/train/ \
--test_batch_size=4 \
--test_workers=4

but it doesn't work. Did I miss something?
FileNotFoundError: [Errno 2] No such file or directory: 'experiment/logs/flowers/checkpoints/iteration_80000.pt'

UniBester (Owner) commented

You should set the value of --checkpoint_path to None in options/test_options.py.
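For anyone hitting the same error, a minimal sketch of what that change might look like, assuming the repo follows the usual pSp-style argparse options (the class layout and attribute names here are assumptions for illustration, not copied from this repo):

import argparse

# options/test_options.py (sketch; exact structure is an assumption)
class TestOptions:
    def __init__(self):
        self.parser = argparse.ArgumentParser()
        # Default to None so the script loads the model given by
        # --psp_checkpoint_path instead of a hard-coded training checkpoint
        # such as 'experiment/logs/flowers/checkpoints/iteration_80000.pt'.
        self.parser.add_argument('--checkpoint_path', default=None, type=str,
                                 help='Path to a model checkpoint (leave as None here)')

    def parse(self):
        return self.parser.parse_args()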

xiaoxiaomiao39 (Author) commented

Thanks for the quick reply :) Yes, it works now.

xiaoxiaomiao39 (Author) commented

Thanks again for the great work.

I have a few more questions after looking into the code.

  1. The dimension of ocodes is [18, 512]; why is the mean subtraction only done for the first 6 channels?
ocodes = self.encoder(x)
odw = ocodes[:, :6] - av_codes[:, :6]
dw, A, x = self.ax(odw)
codes = torch.cat((dw + av_codes[:, :6], ocodes[:, 6:]), dim=1)
  2. Is it important to normalize the codes with respect to the average face latent? How does the performance change without it?
if self.opts.start_from_latent_avg:
   if self.opts.learn_in_w:
      codes = codes + self.latent_avg.repeat(codes.shape[0], 1)
   else:
      codes = codes + self.latent_avg.repeat(codes.shape[0], 1, 1)
  3. Why split A and ni into two groups?
class Ax(nn.Module):
    def __init__(self, dim):
        super(Ax, self).__init__()
        # One direction matrix per layer: 6 layers of 512 x dim
        self.A = nn.Parameter(torch.randn(6, 512, dim), requires_grad=True)
        self.encoder0 = EqualLinear(512, dim)
        self.encoder1 = EqualLinear(512, dim)

    def forward(self, dw):
        # Encode layers 0-2 and layers 3-5 with separate encoders
        x0 = self.encoder0(dw[:, :3])
        x0 = x0.unsqueeze(-1).unsqueeze(1)
        x1 = self.encoder1(dw[:, 3:6])
        x1 = x1.unsqueeze(-1).unsqueeze(1)
        x = [x0.squeeze(-1), x1.squeeze(-1)]
        # Project each group's sparse code back through its direction matrices
        output_dw0 = torch.matmul(self.A[:3], x0).squeeze(-1)
        output_dw1 = torch.matmul(self.A[3:6], x1).squeeze(-1)
        output_dw = torch.cat((output_dw0, output_dw1), dim=1)
        return output_dw, self.A, x
  4. For the sparse loss, why divide by 32?
class SparseLoss(nn.Module):

    def __init__(self):
        super(SparseLoss, self).__init__()
        self.theta0 = 0.5
        self.theta1 = -1

    def forward(self, X):
        x0 = torch.sigmoid(self.theta0 * X[0].abs() + self.theta1)
        x1 = torch.sigmoid(self.theta0 * X[1].abs() + self.theta1)
        return x0.sum() / 32 + x1.sum() / 32
  5. During the inference stage, how is A refined to Af? I can see that A is split into two groups (groups=[[0,1,2],[3,4,5]]), but I don't know why.
def sampler(outputs, dist, opts):
    means = dist['mean']
    means_abs = dist['mean_abs']
    covs = dist['cov']
    one = torch.ones_like(torch.from_numpy(means[0]))
    zero = torch.zeros_like(torch.from_numpy(means[0]))
    dws = []
    groups = [[0, 1, 2], [3, 4, 5]]
    for i in range(means.shape[0]):
        # Sample a sparse code from the group's fitted Gaussian
        x = torch.from_numpy(np.random.multivariate_normal(mean=means[i], cov=covs[i], size=1)).float().cuda()
        # Zero out directions whose mean magnitude falls below the beta threshold
        mask = torch.where(torch.from_numpy(means_abs[i]) > opts.beta, one, zero).cuda()
        x = x * mask
        for g in groups[i]:
            dw = torch.matmul(outputs['A'][g], x.transpose(0, 1)).squeeze(-1)
            dws.append(dw)
    dws = torch.stack(dws)
    # Edit only the first 6 layers; keep the remaining layers untouched
    codes = torch.cat(((opts.alpha * dws.unsqueeze(0) + outputs['ocodes'][:, :6]), outputs['ocodes'][:, 6:]), dim=1)
    return codes

Looking forward to your reply!

UniBester (Owner) commented

Thank you for your attention.

  1. We only manipulate the first six layers, because the lower layers of StyleGAN control the structural and surface attributes, while the higher layers only control hue attributes, which are not beneficial for downstream tasks.
  2. Using the average latent is a trick from StyleGAN.
  3. We divide them into two groups, i.e. groups=[[0,1,2],[3,4,5]], for less computation and more stable sampling.
  4. We divide the sparse loss by 32 to keep it at a certain order of magnitude; the exact value is not important.
  5. Beta is the threshold used to refine A; a sketch of how that masking works is below.
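To make item 5 concrete, here is a minimal, self-contained sketch of the beta-threshold masking used in the quoted sampler (the sizes and values below are made up for illustration; the point is that only directions whose mean absolute coefficient exceeds beta survive, which is equivalent to refining A down to Af by dropping the unused columns):

import torch

beta = 0.1
means_abs = torch.tensor([0.30, 0.02, 0.15, 0.01, 0.40])  # hypothetical per-direction magnitudes
A_layer = torch.randn(512, 5)   # one layer's direction matrix, A[g]
x = torch.randn(5)              # a sampled sparse code

# Zero out directions whose mean magnitude is below beta (as in sampler)
mask = (means_abs > beta).float()          # -> tensor([1., 0., 1., 0., 1.])
dw = A_layer @ (x * mask)

# Equivalently, refine A to Af by dropping the masked columns outright
keep = means_abs > beta
Af_layer = A_layer[:, keep]                # 512 x 3 refined matrix
dw_refined = Af_layer @ x[keep]
assert torch.allclose(dw, dw_refined, atol=1e-5)

This is only an illustration of how the mask in sampler relates to refining A to Af; the repo itself may implement the refinement differently.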
