Update CI test configuration #551
Conversation
Replaces #549
test/k8s-integration/driver.go
Outdated
err = runCommand("Checking driver pods", statusCmd)
if err != nil {
	return fmt.Errorf("failed to check driver pods: %v", err)
}

time.Sleep(time.Minute)
Do we need this second check?
Oh, I was thinking to keep the "default" namespace check and add one for the "gce-pd-csi-driver" namespace.
I am not sure why it was checking "default" before; it may just have been a mistake.
Also, this runCommand does not actually check pod status: it will not return an error if the pods are not running successfully. In the second check I added, I print out the output for debugging.
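For reference, a real status check could look roughly like the sketch below; checkDriverPodsRunning is a hypothetical helper shown for illustration, not code from this PR:

package k8sintegration

import (
	"fmt"
	"os/exec"
	"strings"
)

// checkDriverPodsRunning fails unless every pod in the given namespace
// reports phase Running. Unlike "kubectl describe", which exits zero even
// for crash-looping pods, this inspects the actual pod status.
func checkDriverPodsRunning(namespace string) error {
	out, err := exec.Command("kubectl", "get", "pods", "-n", namespace,
		"-o", "jsonpath={.items[*].status.phase}").CombinedOutput()
	if err != nil {
		return fmt.Errorf("failed to get driver pods: %v", err)
	}
	phases := strings.Fields(string(out))
	if len(phases) == 0 {
		return fmt.Errorf("no driver pods found in namespace %q", namespace)
	}
	for _, phase := range phases {
		if phase != "Running" {
			return fmt.Errorf("driver pod not running, phase: %q", phase)
		}
	}
	return nil
}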
When we changed namespaces, we forgot to update the test to check in the correct namespace. So we can remove the check for the default namespace.
Got it. I plan to have a follow-up PR to check pod status, print out debug logs, etc.
This repo switched to using a Deployment instead of a StatefulSet, so importing it is not going to work as is.
Maybe we can switch k/k to use a Deployment too?
@mattcary so far your PR looks good to me. In case some pod is not running correctly, do we collect logs or events to help debug?
Never mind, I think logs are collected in #140.
Done, switched to deployment and moved the driver name to the exported function.
Actually there already is a deployment-waiter in framework/deployment/wait that could be used. That's what my change is using now, so there's nothing to export to you.
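Roughly, reusing that waiter could look like the sketch below; the package alias and exact signatures are assumptions and can differ across k/k releases:

package k8sintegration

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	clientset "k8s.io/client-go/kubernetes"
	e2edeployment "k8s.io/kubernetes/test/e2e/framework/deployment"
)

// waitForDriverDeployment blocks until the driver's Deployment has all
// replicas available, delegating to the upstream e2e framework waiter.
func waitForDriverDeployment(c clientset.Interface, namespace, name string) error {
	d, err := c.AppsV1().Deployments(namespace).Get(context.TODO(), name, metav1.GetOptions{})
	if err != nil {
		return err
	}
	return e2edeployment.WaitForDeploymentComplete(c, d)
}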
}
nodes := strings.Fields(string(out))
for _, node := range nodes {
	taintCmd := exec.Command("kubectl", "taint", "node", node, "node.kubernetes.io/os:NoSchedule-")
Is this required if we're not testing mixed-OS clusters?
This is only needed for Windows clusters.
Should kubernetes-e2e.py be handling this as part of windows cluster bringup?
no, it is not.
Is there a reason why it shouldn't be part of the standard cluster bringup tool? Why do end users need to do this manually?
I will check with the Windows team to see whether we can remove the taint in the bringup tool.
Linux pods get scheduled on a Windows cluster, so to avoid pods being scheduled onto the Windows nodes, we currently taint them. We could add a node selector to the pods, but I think not all pods have one, so tainting the Windows nodes is the easiest way for now.
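For comparison, the node selector alternative would mean adding something like this to every Linux-only pod spec (a client-go sketch, not code in this repo):

package k8sintegration

import corev1 "k8s.io/api/core/v1"

// linuxOnlyPodSpec shows the alternative to tainting: pin Linux-only pods
// to Linux nodes explicitly. Every manifest would need this, which is why
// tainting the Windows nodes is simpler for now.
var linuxOnlyPodSpec = corev1.PodSpec{
	NodeSelector: map[string]string{"kubernetes.io/os": "linux"},
}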
I guess my question is: why doesn't the cluster installer that brings up the Windows cluster do it?
Oh sorry, I see this is removing the taint that got added. So in order to run e2e tests, we need to remove the taint so the e2e pods will run on those nodes. But is there a reason why kubernetes_e2e.py, which is an e2e bootstrapping script, can't do it?
Anyway, let's think about whether we should enhance kubernetes_e2e in the future so that individual test runners don't have to do this. For now, can you add a comment explaining why we need to remove the taint?
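For illustration, the requested comment plus the existing loop might read like this; the wording is a suggestion, and removeWindowsNodeTaints is just a sketch of the surrounding code:

package k8sintegration

import (
	"fmt"
	"os/exec"
)

// removeWindowsNodeTaints strips the scheduling taint from each Windows node.
// Windows nodes are tainted at bringup (node.kubernetes.io/os:NoSchedule) so
// that Linux-only pods are not scheduled onto them; the e2e test pods must run
// on the Windows nodes, so the taint has to be removed first. The trailing "-"
// is kubectl's remove-taint syntax.
func removeWindowsNodeTaints(nodes []string) error {
	for _, node := range nodes {
		taintCmd := exec.Command("kubectl", "taint", "node", node, "node.kubernetes.io/os:NoSchedule-")
		if out, err := taintCmd.CombinedOutput(); err != nil {
			return fmt.Errorf("failed to remove taint from node %s: %s, err: %v", node, out, err)
		}
	}
	return nil
}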
test/run-windows-k8s-integration.sh
Outdated
@@ -22,7 +21,7 @@ readonly teardown_driver=${GCE_CLUSTER_ZONE:-true}
 make -C ${PKGDIR} test-k8s-integration

 base_cmd="${PKGDIR}/bin/k8s-integration-test \
-  --platform=windows --local-k8s-dir=${LOCAL_K8S_DIR} --bringup-cluster=false --teardown-cluster=false --teardown-driver=${teardown_driver}\
+  --platform=windows --local-k8s-dir=${k8s_dir} --bringup-cluster=false --teardown-cluster=false --teardown-driver=${teardown_driver}\
local-k8s-dir is only for when you're running the test from your own local machine. Do we need it for CI? k8s-integration should download Kubernetes and build it in a tmp directory.
In the prow test using kubetest, Kubernetes is installed under the current dir, which is the PD CSI driver repo. In the test, kubetest needs to run from the Kubernetes dir. We don't need to download Kubernetes locally.
kubetest should be run from testDir, not k8sDir: https://github.com/kubernetes-sigs/gcp-compute-persistent-disk-csi-driver/blob/master/test/k8s-integration/main.go#L221
Maybe we need to set --test-version here.
Btw, does kubernetes_e2e support GKE clusters?
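For context, "run from testDir" means the harness sets the command's working directory before invoking kubetest, roughly as below; the flag handling is elided and the helper name is made up:

package k8sintegration

import (
	"os"
	"os/exec"
)

// runKubetest invokes kubetest from inside the kubernetes source tree
// (testDir), matching the main.go link above.
func runKubetest(testDir string, args ...string) error {
	cmd := exec.Command("kubetest", args...)
	cmd.Dir = testDir // kubetest expects to run from the k8s tree
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd.Run()
}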
I see there is some confusion here. For the current PD driver CI tests for Linux, we build Kubernetes and install it under a tmp dir before running the test. But this step is not needed for the Windows test setup. It uses the script /workspace/scenarios/kubernetes_e2e.py, which will install Kubernetes like kube-up, I think: https://github.com/kubernetes/test-infra/blob/master/config/jobs/kubernetes-sigs/gcp-compute-persistent-disk-csi-driver/gcp-compute-persistent-disk-csi-driver-windows.yaml#L65
This script will install Kubernetes under the current dir, which is the PD CSI driver repo. The local-k8s-dir setting here does not mean the test will install Kubernetes locally. I used this setting because the test uses it to change the current dir here: https://github.com/kubernetes-sigs/gcp-compute-persistent-disk-csi-driver/blob/master/test/k8s-integration/main.go#L421
Because I didn't set kubeVersion or testVersion, the test code will not install Kubernetes locally; it is not needed:
if len(*kubeVersion) != 0 {
I plan to have another PR to edit the config parameters so that they are easier to read. Also, for the Linux test, I think we can use /workspace/scenarios/kubernetes_e2e.py to set up Kubernetes instead of building it locally. We can set it to use ci/latest or ci/master depending on which Kubernetes version we want to test.
I think we should set "--test-version" instead. The reason is that we may use different Kubernetes versions to 1) bring up the cluster and 2) run the e2e binary.
We generally use "master" as the e2e version so that we can get new test cases and test fixes without relying on them being backported to older Kubernetes versions.
"--local-k8s-dir" is intended for developer testing where you just want to run the e2es quickly without having to download and build Kubernetes.
My understanding is that since we already have prebuilt Kubernetes in different versions (https://github.com/kubernetes/test-infra/blob/a4b7e151278a26fb45ee186b7fdc96394deed016/docs/kubernetes-versions.md),
we could just use them by setting extract=ci/k8s-beta, ci/k8s-master, etc.
That would save k8s build time. Is there a specific version we want to test?
Yes, I think for release versions we can save a lot of time by using extract. But if we want to test head, we still need to build. We do have https://github.com/kubernetes-sigs/gcp-compute-persistent-disk-csi-driver/blob/master/test/k8s-integration/main.go#L40, which I think can help with this.
But still, we have 3 distinct variables we use to distinguish different use cases (sketched after this list):
- kubeVersion to set the cluster version
- testVersion to set the e2e version
- local for local development (you're making a change in kubernetes or the e2es that you want to test)
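Schematically, the intended split could look like this; resolveK8sSource and the download callback are hypothetical stand-ins for the repo's actual helpers:

package k8sintegration

// resolveK8sSource sketches how the three knobs interact. download stands
// in for the repo's download/build logic and returns the checkout dir.
func resolveK8sSource(kubeVersion, testVersion, localK8sDir string,
	download func(version string) (string, error)) (clusterDir, testDir string, err error) {
	if localK8sDir != "" {
		// Developer flow: reuse an existing local checkout for both.
		return localK8sDir, localK8sDir, nil
	}
	if kubeVersion != "" {
		// Version used to bring up the cluster.
		if clusterDir, err = download(kubeVersion); err != nil {
			return "", "", err
		}
	}
	if testVersion != "" {
		// Version used to build and run the e2e binary (usually master).
		if testDir, err = download(testVersion); err != nil {
			return "", "", err
		}
	}
	return clusterDir, testDir, nil
}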
Synced up offline. I will remove the need to locally download and build the e2e binary in a separate PR. Instead we could set extract=ci/k8s-version.
test/run-windows-k8s-integration.sh
Outdated
readonly PKGDIR=${GOPATH}/src/sigs.k8s.io/gcp-compute-persistent-disk-csi-driver
readonly LOCAL_K8S_DIR=${GOPATH}/src/k8s.io/kubernetes
readonly k8s_dir=${PKGDIR}/kubernetes
PKGDIR points to the PD CSI driver repo.
Right, in the prow test, Kubernetes is installed under the current dir, which is the PD CSI driver repo. In the test, kubetest needs to be in the Kubernetes dir.
I think we don't need this variable anymore?
Removed it, thanks!
test/run-windows-k8s-integration.sh
Outdated
@@ -22,7 +21,7 @@ readonly teardown_driver=${GCE_CLUSTER_ZONE:-true}
 make -C ${PKGDIR} test-k8s-integration

 base_cmd="${PKGDIR}/bin/k8s-integration-test \
-  --platform=windows --local-k8s-dir=${LOCAL_K8S_DIR} --bringup-cluster=false --teardown-cluster=false --teardown-driver=${teardown_driver}\
+  --platform=windows --local-k8s-dir=${k8s_dir} --bringup-cluster=false --teardown-cluster=false --teardown-driver=${teardown_driver}\
teardown_driver references GCE_CLUSTER_ZONE. Is that intended?
Fixed. Thanks!
readonly PKGDIR=${GOPATH}/src/sigs.k8s.io/gcp-compute-persistent-disk-csi-driver
readonly LOCAL_K8S_DIR=${GOPATH}/src/k8s.io/kubernetes
readonly k8s_dir=${PKGDIR}/kubernetes
readonly overlay_name="${GCE_PD_OVERLAY_NAME:-alpha}"
readonly do_driver_build="${GCE_PD_DO_DRIVER_BUILD:-true}"
DO_DRIVER_BUILD no longer exists.
I don't see DO_DRIVER_BUILD; it has GCE_PD_DO_DRIVER_BUILD?
Force-pushed from 07a5ea8 to 2090aee.
Comments were addressed. PTAL.
Force-pushed from 2090aee to 88a5e0d.
test/k8s-integration/driver.go
Outdated
}

klog.V(2).Infof("GCE PD driver pods: \n %v", string(out))
BTW, this is fixed in #140
ack.
Force-pushed from 88a5e0d to 4259bf2.
/test pull-gcp-compute-persistent-disk-csi-driver-e2e-windows
/retest
/retest
test/k8s-integration/driver.go
Outdated
@@ -71,9 +72,9 @@ func installDriver(goPath, pkgDir, stagingImage, stagingVersion, deployOverlayNa
 statusCmd := exec.Command("kubectl", "describe", "pods", "-n", driverNamespace)
 err = runCommand("Checking driver pods", statusCmd)
 if err != nil {
-	return fmt.Errorf("failed to check driver pods: %v", err)
+	return fmt.Errorf("failed to get nodes: %v", err)
The error message is not correct?
fixed.
test/k8s-integration/driver.go
Outdated
}

klog.V(2).Infof("GCE PD driver pods: \n %v", string(out))
out is no longer relevant in this case.
fixed. PTAL
Force-pushed from 4259bf2 to 6bcb084.
test/k8s-integration/driver.go
Outdated
@@ -71,9 +71,8 @@ func installDriver(goPath, pkgDir, stagingImage, stagingVersion, deployOverlayNa
 statusCmd := exec.Command("kubectl", "describe", "pods", "-n", driverNamespace)
 err = runCommand("Checking driver pods", statusCmd)
 if err != nil {
-	return fmt.Errorf("failed to check driver pods: %v", err)
+	return fmt.Errorf("failed to describe nodes: %v", err)
Should this be "describe pods"?
oh, not sure what I was thinking. fixed.
update ci test configuration: fix Windows integration script, remove env variable GCE_PD_DO_DRIVER_BUILD direct setting
Force-pushed from 6bcb084 to 1b4d17f.
/test pull-gcp-compute-persistent-disk-csi-driver-e2e-windows
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jingxu97, msau42 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
@jingxu97: The following test failed:
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
/retest
update ci test configuration
Fix Windows integration script
Remove env variable GCE_PD_DO_DRIVER_BUILD direct setting