Kogence Forum

IyorzorBenjamin (talkcontribs)

Please, I need to know how to add a collaborator to my model. I just complained about a job that did not end well, and I was asked to add admin@kogence.com as a collaborator to the model. The simulation is #9165.

Centaure99 (talkcontribs)

Dear @IyorzorBenjamin, sign in to kogence.com and open your model. You will see a Collaboration tab on the top navigation bar. There you can add our support team using the email admin@kogence.com. Please give the team "admin" rights.

Admin0 (talkcontribs)

Dear @IyorzorBenjamin, Do you still need help with this?

HarilLian (talkcontribs)

Hello, the output of my .sh script in Quantum Espresso shows that BIN_DIR does not exist. I have given the technical team access to my model. Could you help me check what is incorrect?

Admin0 (talkcontribs)

@HarilLian

Our team is scheduled to look at this issue tomorrow. The team has benchmarked Quantum Espresso for several enterprise customers and does not foresee any issues in running your simulation properly. We will update you by the end of tomorrow.

The team tells me that it is best to run such use cases with an OS-bypass network, so that multi-node MPI can access memory directly without going through the OS stack. Such a workflow setup is typically part of our enterprise service, but as a courtesy the team has agreed to look into doing such a setup for you. This may take 1 to 2 weeks, though, as the team is dealing with several high-priority requests.
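In the meantime, the "BIN_DIR does not exist" message reported above can be checked with a quick sanity test on the directory that the run script points to. The following is only a minimal sketch: it assumes that BIN_DIR is available as an environment variable and that the executable you need is `pw.x`; the helper name `check_bin_dir` is hypothetical and not part of Quantum Espresso or Kogence.

```python
import os
from pathlib import Path


def check_bin_dir(bin_dir, executables=("pw.x",)):
    """Return a list of problems found; an empty list means the check passed.

    bin_dir     -- path that the run script uses as BIN_DIR (assumed input)
    executables -- file names expected inside that directory (assumed: pw.x)
    """
    if not bin_dir:
        return ["BIN_DIR is not set"]
    path = Path(bin_dir)
    if not path.is_dir():
        return [f"BIN_DIR does not exist or is not a directory: {path}"]
    problems = []
    for name in executables:
        exe = path / name
        if not exe.is_file():
            problems.append(f"missing executable: {exe}")
        elif not os.access(exe, os.X_OK):
            problems.append(f"not executable: {exe}")
    return problems


if __name__ == "__main__":
    # Assumption: the script reads BIN_DIR from the environment.
    issues = check_bin_dir(os.environ.get("BIN_DIR"))
    print("\n".join(issues) if issues else "BIN_DIR looks fine")
```

If the check reports a missing directory, the path set in the script most likely does not match where the executables are installed on the machine the job runs on.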

Admin0 (talkcontribs)

Dear @HarilLian:

First of all we want to thank you for your patience.  Our apologies for the delay in getting back to you.

We had more than the usual number of service requests in our queue.

We have created two examples for you.

1/ Single node parallel simulations.

https://kogence.com/app/docs/2D_Graphene_VC_Rleax_Calculations_On_Single_Node_Using_MPI

2/ Multi node parallel simulations.

https://kogence.com/app/docs/2D_Graphene_VC_Relax_Calculations_On_Autoscaling_Clusters

You should be able to run both of these models as is (a Copy is not needed, since you have been added as a collaborator). Feel free to make a personal Copy to adapt it to your own use case.

We have run these models ourselves and thoroughly tested both cases. The first simulation uses 100% of all 36 CPUs for the entire run. The second simulation uses 100% of all 36 CPUs on both nodes (72 CPUs in total) for the entire run.

For more details:

https://kogence.com/app/docs/Help:User_Manual

https://kogence.com/app/docs/Quantum_Espresso

Please let us know if you have any further questions.

HarilLian (talkcontribs)

hi gauss

I saw your message just now but it disappeared after I signed in to my page

could you please send me again your last message?

yes, is it that I cannot run for 30 hrs?

that is why I have the error message?

but if I didn't know when the simulation will end

how can I set the time limit?

hi gauss so how could I prevent the error that job ends due to time limit

gauss, get it

so can you help me cancel all the simulations I am running.

I want to change to the strategy you told me just now

if I continue to run, it will show the error sign again for all my simulations

actually, I've tried to stop one of my simulations

but it doesn't respond

I clicked stop many times

but it doesn't work, I don't know why

yes I can do it

if I log out, can I reconnect to this chat?

I'm trying to cancel simulation #9212

but it is still active

I've tried half an hour or maybe 1 hour ago

yes please

please cancel all of my simulations, so I can do some modification on these models

gauss so when will you close all of these

I have a deadline, so I need to calculate the time remaining for me

you can do the debugging

take your time

Centaure99 (talkcontribs)

Please go to My Usage page. Team tells me that most of your recent jobs ran for >24hrs. Some ran for ~30hrs.

Can you please match these against the time limits you selected at the launch of the jobs? The only way a job can terminate before the time limit is if it ended properly or if the code gave an error, which should be reported in ___titusiError. If a job ends due to the time limit, then ___titusiError may not be updated.

@HarilLian, the best approach is to keep a larger pool of credits and select a large time limit. We always automatically refund credits based on the actual end time of the simulation. One note of caution: batch mode simulations end automatically and credits get refunded automatically. But if you have opened any graphical application window inside the visualizer (other than the ones that we open automatically for job monitoring), for example the Matlab desktop GUI, graphical plots/figures, or a CloudShell terminal, then you need to make sure that you close it manually. If such windows remain open, the job will only end when it reaches the time limit.
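To make the time-limit advice concrete, here is a rough sketch of the arithmetic. It is an illustration only: it assumes a simple linear model in which unused hours are refunded at a flat hourly rate, and the function names, the 50% safety margin, and the rate of 2.0 credits per hour are hypothetical values, not actual Kogence pricing.

```python
def suggest_time_limit(past_runtimes_hr, margin=0.5):
    """Pick a time limit padded above the longest runtime seen so far.

    margin=0.5 means 50% headroom; this value is an assumption for illustration.
    """
    if not past_runtimes_hr:
        raise ValueError("need at least one past runtime")
    if margin < 0:
        raise ValueError("margin must be non-negative")
    return max(past_runtimes_hr) * (1.0 + margin)


def refund_estimate(time_limit_hr, actual_runtime_hr, credits_per_hr):
    """Credits returned under the assumed linear model: unused hours times the rate."""
    if time_limit_hr <= 0 or actual_runtime_hr < 0 or credits_per_hr < 0:
        raise ValueError("inputs must be non-negative, with a positive time limit")
    unused_hr = max(0.0, time_limit_hr - actual_runtime_hr)
    return unused_hr * credits_per_hr


if __name__ == "__main__":
    limit = suggest_time_limit([24, 30])          # earlier jobs ran ~24 and ~30 hrs
    print(limit)                                  # 45.0
    print(refund_estimate(limit, 30, 2.0))        # 30.0 (hypothetical rate)
```

Under this model, choosing a generous limit costs nothing extra as long as the job ends on its own, while a limit that is too tight ends the job before it finishes.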

Hi HarilLian, you should be able to do that yourself. Just open each simulation and press the Stop button. You will see that credits get refunded to you for the remaining time that you did not use.

That is interesting. I am not seeing any system issues, and nothing has been reported to us. Can you refresh your browser, log out, and log back in?

Then try stopping it one more time.

I am monitoring for any errors on backend right now.

Just wait a few seconds. I see the backend requests came and got processed.

I will monitor it. If it does not end, I will get the team to look into it. I believe it should.

So do you want me to close all your running simulations?

Sure. Will do. Thanks.

Give me 10-15 mins pls. I can close them right away, but I want the team to look at the reason so we don't run into this issue again. Or, if you prefer, I can skip debugging and just do that right away.

HarilLian (talkcontribs)

hi gauss

do you know why my simulation always shows an error after running for 20+ hours

I have chosen enough RAM

no error message at all

it just shows an error in the running button

terminated without any output

eastern time tuesday 11PM

my time limit is 30hr

the error sign appears in the menu of "my model"

no

my program just terminated without any explanation

run button becomes green again

no

the output file is not shown

It just gives me the initial files I have

no results file and no output and error file

I am doing research

I have to run it as soon as possible

It shows an error and I rerun

but I have spent over 700 dollars on this error

you can check simulation from #9201 to #9205

yes thx

I have been so tired with this error = =

Centaure99 (talkcontribs)

What error is it showing HarilLian?

Did the simulations stop or get terminated?

What time limit did you choose at the time of start? Check your email notification that you get at the time of start of your simulation.

BTW, what do you mean by it shows an error in the running button? Is the Run button inactive?

I see. If your simulations are still running, then the Run button should not be there. There should be a Stop button, and the Visualizer should be active. Is that the case?

If you go to Files tab, you will see two files like ___titusiOutput and ___titusiError. Do you see those files?

Can you send me a snapshot of what you see under the Files tab of one of these models?

But that simulation is still running, HarilLian.

Can you please send me the simulation ID# and title? I will have someone take a look.

As far as I can tell, the only reason this can happen is if the time limit selected at the time of start was reached.

We will research this, HarilLian.

By the way, the team is just preparing documentation for your previous ticket. It should be on its way to you soon.

Thanks HarilLian. I will raise another ticket for this issue.

MuhammadSahal (talkcontribs)

I created new models, but I don't understand why they are not in My Models.

Admin0 (talkcontribs)

@MuhammadSahal, please make sure you did not get logged out: refresh your browser and check that you are still logged in. Let us know if you are still having issues and need further assistance.

Admin0 (talkcontribs)

@MuhammadSahal, Do you need any further help?

Abubakarba'asir (talkcontribs)

init

I can't run my calculation

Admin0 (talkcontribs)

@Abubakarba'asir, please describe your issue in more detail. We aren't able to help without a detailed description of the issue.

Admin0 (talkcontribs)

@Abubakarba'asir, Do you need any further help?

AbdulAliFahimi (talkcontribs)

I want to run a "vc-relax" calculation in QE for a four-layer heterostructure which consists of almost 300 atoms. How many CPU units do I need? And should I go for high CPU or high RAM?

Admin0 (talkcontribs)

@AbdulAliFahimi, you can first run your model on a small machine. It will crash, but the output file should show the amount of RAM per process that the simulation needs. The number of CPUs simply depends on the number of parallel processes you want to run. The more parallel processes you run, the faster you can finish your simulation. In terms of cost on Kogence, running a 5 hr simulation on 2 CPUs is the same as running a 1 hr simulation on 10 CPUs. So there is really no benefit to running simulations on smaller machines. Your decision should simply be based on the RAM you need per process.

Let us know if you need any further help.
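The sizing reasoning above can be sketched as a small calculation. This is an illustration under stated assumptions, not a Kogence tool: it assumes cost scales with CPU-hours and that runtime scales ideally with the number of processes (real speedup is usually less than ideal), and the machine sizes and the 4 GB per process figure are made-up example values.

```python
def cpu_hours(n_cpus, hours):
    """CPU-hours consumed by a run; the assumed basis for cost."""
    return n_cpus * hours


def max_processes(machine_ram_gb, ram_per_process_gb, n_cpus):
    """Largest number of parallel processes that fit in RAM, capped at one per CPU."""
    if ram_per_process_gb <= 0:
        raise ValueError("ram_per_process_gb must be positive")
    fits_in_ram = int(machine_ram_gb // ram_per_process_gb)
    return min(n_cpus, fits_in_ram)


if __name__ == "__main__":
    # Same CPU-hours, so the same cost under the assumed model.
    print(cpu_hours(2, 20), cpu_hours(10, 4))     # 40 40
    # Hypothetical machine: 36 CPUs, 64 GB RAM, 4 GB needed per process.
    print(max_processes(64, 4, 36))               # 16
    # With 256 GB of RAM, every CPU can host one process.
    print(max_processes(256, 4, 36))              # 36
```

In the 64 GB case RAM, not CPU count, limits how many processes can run, which is why the RAM needed per process drives the choice between a high-CPU and a high-RAM machine.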

Admin0 (talkcontribs)

@AbdulAliFahimi, Do you need any further help?

Pejuangsarjana (talkcontribs)

Why can't I create a new model?

Admin0 (talkcontribs)

@Pejuangsarjana,

Just make sure that you are logged in. Let us know if you are still having issues, and please send some snapshots that show exactly what the issue is.

Admin0 (talkcontribs)

@Pejuangsarjana, Do you still need further help? Can we close this ticket?