Aleksandr Melnikov
3e1a4dd313
Refactoring code since it's empty initialization.
2020-11-10 15:58:26 -08:00
Aleksandr Melnikov
d2dfd4cc34
Updating the host placeholder.
2020-11-10 15:58:26 -08:00
Aleksandr Melnikov
2d45b0a015
Using the right yaml library to marshal.
2020-11-10 15:58:26 -08:00
Aleksandr Melnikov
cb7bb793e7
Adding code to support sidecars and accessing them.
- Specifically, if a sidecar is labeled with "sys-tensorboard-like",
code attempts to create services and virtual services (routes).
- Code will create argo templates that are resource type, where the
manifests are the generated resources.
The templates are then added right before the template with the
sidecar that requires these services and routes.
2020-11-10 15:58:26 -08:00
rushtehrani
875097fed7
only inject one sys-send-status task
2020-11-06 17:35:55 -08:00
Andrey Melnikov
7129fdf55f
fix: issue where s3 with gcs wasn't working because of incompatible listing of files.
2020-10-16 16:41:49 -07:00
Andrey Melnikov
20c4950b69
feat: revert jwt token from auth
2020-10-14 11:53:05 -07:00
Aleksandr Melnikov
83a2543b13
Updating function name to reflect what it's doing.
2020-10-05 14:04:32 -07:00
Aleksandr Melnikov
e8dae0f2e9
Since we're no longer relying on running nodes, we don't need logic
relating to them.
- We can just check if a nodeSelector is set on the template.
2020-10-05 13:59:14 -07:00
Aleksandr Melnikov
7fe0ab2654
Tweaking names so it's more clear why they are there.
2020-10-05 11:51:04 -07:00
Aleksandr Melnikov
cc2c51ace5
Removing resource requests and limits.
- Using hostPort on the node as a way to require dedicated nodes
for workspaces and workflows.
2020-10-05 11:46:01 -07:00
Aleksandr Melnikov
656026ac84
Refactored workflow_execution resource calculation to use the new
function, to put the logic into the same place.
2020-10-02 13:08:48 -07:00
Andrey Melnikov
49b9bc4f93
fix: unit tests
2020-09-30 12:54:35 -07:00
Andrey Melnikov
83a4238153
Merge pull request #617 from Vafilor/feat/workspace.listing
feat: workspace list filtering, sorting and statistics
2020-09-29 11:20:46 -07:00
Andrey Melnikov
77716ba56b
chore: updated method docs
2020-09-29 11:17:50 -07:00
Andrey Melnikov
8b4a70d958
feat: added support for sorting and filtering workspaces.
* Also added a LabelFilter interface
2020-09-28 12:09:16 -07:00
Rush Tehrani
91b97d9243
Merge pull request #611 from aleksandrmelnikov/feat/core.607-revert.to.resource.limits.requests
feat: Undo pod anti-affinity for scheduling nodes, use resource requests instead.
2020-09-25 14:02:02 -07:00
Aleksandr Melnikov
e7cef240c4
Fixing code that assigns a GPU limit for a workflow.
- This GPU assignment works for cron created workflows as well.
2020-09-25 10:38:21 -07:00
Aleksandr Melnikov
dca6db842c
Adding back injectContainerResourceQuotas for workflows.
- This time, the node capacity is grabbed from running nodes instead
of information from configmap (which is grabbed from params.yaml)
- Added support for two different instance-type keys.
Note that GPU support does not work at this time.
- There is no ResourceName "nvidia.com/gpu" in library code, so it throws
a deref nil error.
2020-09-24 17:28:54 -07:00
Andrey Melnikov
f7770618ca
update: added parameter to control listing of system workflows
2020-09-24 12:32:35 -07:00
Aleksandr Melnikov
77812419d2
Removing pod anti-affinity injection from workflows.
- Workspaces end up calling workflows. This has to be removed
as part of switching back to resource requests.
2020-09-24 11:03:37 -07:00
Andrey Melnikov
6fa123b122
feat: added explicit nulls sorting for workflow execution columns. Idea is to treat null as an empty/zero value. So in asc, it is first.
2020-09-23 19:13:53 -07:00
Andrey Melnikov
b1d0ab1d59
update: added workflow template name and uid to select of workflow executions
2020-09-23 16:20:38 -07:00
Rush Tehrani
c45231c106
Merge pull request #585 from aleksandrmelnikov/feat/core.583-workflows.use.pod.anti.affinity
feat: Updated Workflows, and Cron created Workflows, to ensure they get their own pod by using Pod AntiAffinity. This replaces resource limits and/or requests.
2020-09-21 16:20:13 -07:00
Aleksandr Melnikov
9ec45e4f34
Formatting feedback and fix.
2020-09-21 16:12:53 -07:00
Aleksandr Melnikov
48e2050e97
Adjusting code so that we don't need the nodePoolLabel to figure
out the selected Node.
- The nodePoolLabel can change with the params.yaml and different
k8s versions.
2020-09-18 16:34:07 -07:00
Aleksandr Melnikov
d42f88e04c
Adding flag to decide when to add pod affinity.
- This avoids adding it every time the loop finds a nodeSelector that's
not nil in a template
2020-09-18 15:53:00 -07:00
Aleksandr Melnikov
6dd7c0ac70
Adjusting the pod anti-affinity per feedback.
- The pod anti-affinity should be set for a template that has a nodeSelector
value (not nil).
- If the template does not have a nodeSelector, we do nothing.
2020-09-18 15:51:48 -07:00
Aleksandr Melnikov
31076bc70d
Adding systemConfig to function.
- Also added error return
2020-09-18 15:50:28 -07:00
Aleksandr Melnikov
aaf20b4ab6
Updating code to use the returned wf.
2020-09-17 17:43:40 -07:00
Aleksandr Melnikov
7bc1056bc6
Updating function to return the updated workflow.
2020-09-17 17:42:23 -07:00
Aleksandr Melnikov
880e8ba082
Removing prior code that injected resource requests and limits
to workflows.
2020-09-17 17:08:06 -07:00
Aleksandr Melnikov
ec634b66ca
Adding function to ensure a workflow gets a dedicated node for all the
templates it executes.
- Note that workflows executed by cron are also affected.
2020-09-17 17:06:20 -07:00
Andrey Melnikov
d524c3cb66
update: fixed issue where GetWorkflowExecutionStatisticsForNamespace included workspace workflows (is_system = true). Added support for filtering by phase.
2020-09-16 19:37:10 -07:00
Andrey Melnikov
a9953683a9
feat: added endpoint to get the workflow execution statistics for a namespace
2020-09-16 19:37:10 -07:00
Andrey Melnikov
d4d4884e5b
feat: initial pagination updates
2020-09-16 18:26:51 -07:00
Aleksandr Melnikov
029469e031
Changing the secret "key" value.
- Otherwise, the workflow can't get access to GCS bucket.
2020-09-08 11:10:05 -07:00
Andrey Melnikov
5d3345ded2
update: removed port since default is 80
2020-08-17 16:17:10 -07:00
Andrey Melnikov
0274785f72
fix: removed port environment variable as it is no longer needed since host is not using an environment variable
2020-08-17 16:14:30 -07:00
Andrey Melnikov
bcec3c13fd
fix: update core pod host to use string instead of environment variable which evaluates to a static ip
2020-08-17 16:07:47 -07:00
rushtehrani
ce805bfa20
feat: Support script templates in Workflow
2020-08-12 22:31:42 -07:00
Andrey Melnikov
4b4cccbd74
Merge pull request #486 from onepanelio/feat/remove-nvidia-smi-mount
chore: Remove nvidia-smi mount
2020-08-11 10:10:57 -07:00
Aleksandr Melnikov
d41dad0074
Merge pull request #493 from onepanelio/feat/onepanelio.labels.upgrade
feat: upgrade label structure to support querying with multiple labels
2020-08-10 17:01:19 -07:00
Andrey Melnikov
8586639a6c
update: cron workflows to use new labels
2020-08-09 16:15:05 -07:00
Andrey Melnikov
c81c2d7672
update: fixed issues where workspace template version labels were not being correctly set/got. Also updated generic server endpoints with new logic
2020-08-09 13:45:24 -07:00
Andrey Melnikov
f02e7791f7
update: updated labels for workflow templates and their versions
2020-08-08 15:56:17 -07:00
Andrey Melnikov
6edca5731b
update: change reader to use bufio for buffered reading
2020-08-07 15:15:25 -07:00
Andrey Melnikov
63bdb69968
fix: issue where scanner could not handle long lines while reading logs
2020-08-07 15:10:55 -07:00
rushtehrani
b2e887c1c9
remove nvidia-smi mount
2020-08-06 14:18:21 -07:00
Andrey Melnikov
3e6a48ba1e
fix: issue where running a workflow execution again failed.
2020-08-04 10:40:18 -07:00